Preface



In 1975 Szemerédi proved the long-standing conjecture of Erdős and Turán
that any subset of $\mathbb{Z}$ having positive upper Banach density contains
arbitrarily long arithmetic progressions. Szemerédi's proof was entirely
combinatorial, but two years later Furstenberg gave a quite different proof of
Szemerédi's Theorem by first showing its equivalence to an ergodic-theoretic
assertion of multiple recurrence, and then bringing new machinery in ergodic
theory to bear on proving that. His ergodic-theoretic approach subsequently
yielded several other results in extremal combinatorics, as well as revealing a
range of new phenomena according to which the structures of
probability-preserving systems can be described and classified.

In this work I survey some recent advances in understanding these
ergodic-theoretic structures. It contains proofs of the norm convergence of the
'nonconventional' ergodic averages that underlie Furstenberg's approach to
variants of Szemerédi's Theorem, and of two of the recurrence theorems of
Furstenberg and Katznelson: the Multidimensional Multiple Recurrence Theorem,
which implies a multidimensional generalization of Szemerédi's Theorem;
and a density version of the Hales-Jewett Theorem of Ramsey Theory.

* * * 

The text below was originally submitted as my Ph.D. dissertation at UCLA, 
after being assembled from a number of earlier papers. It seems worth repeat- 
ing the acknowledgements from that dissertation as well. 

Many people deserve my thanks for their part in my mathematical education.
Listing them in roughly the order we met, I must at least mention David
Fremlin, Imre Leader, Tim Gowers, Béla Bollobás, Ben Garling, James Norris,
Assaf Naor, Yuval Peres, Vitaly Bergelson, Christoph Thiele, Sorin Popa,
David Aldous, Tamar Ziegler, Bryna Kra, Bernard Host, Mariusz Lemańczyk
and Dan Rudolph. I could have written a much longer list, but still the
selection would have been slightly arbitrary: to make it complete would have
required far more space than I have available.

During the same period, I have benefited from the financial support of 
Trinity College, Cambridge, the Shapiro and Huang Families through their 
UCLA graduate student fellowships, and Microsoft Corporation. No less sig- 
nificant, I have been able to rely unquestioningly on the support of family and 
friends, for whom I can only hope to be so generous in turn should the need 
arise. 

Terence Tao, who advised this dissertation, has certainly taught me more 
during the last four years than either of us fully appreciates, and his energy 
and enthusiasm for mathematics are a constant motivation for those around 
him. 

Venice Beach, California 
May 2010 
Chapter 1
Introduction 



The concerns of this work stem from the following remarkable result of
Szemerédi ([Sze75]), which confirmed an old conjecture of Erdős and Turán ([ET36]).

Szemerédi's Theorem. For any $\delta > 0$ and $k \geq 1$ there is some $N_0 \geq 1$
such that if $N \geq N_0$ then any $A \subseteq \{1, 2, 3, \ldots, N\}$ with
$|A| \geq \delta N$ includes a nontrivial $k$-term arithmetic progression:
$A \supseteq \{a, a + n, \ldots, a + (k-1)n\}$ for some $a \in \{1, 2, \ldots, N\}$ and $n \geq 1$.
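
To make the conclusion of Szemerédi's Theorem concrete, the following minimal Python sketch searches a finite set of positive integers for a nontrivial $k$-term arithmetic progression by brute force. The function name `find_progression` and the example set are illustrative assumptions, not part of the original text, and the search says nothing about how large $N_0$ must be.

```python
def find_progression(A, k):
    """Return (a, n) with a, a+n, ..., a+(k-1)n all in A and n >= 1, else None."""
    members = set(A)
    if k < 1 or not members:
        return None
    top = max(members)
    for a in sorted(members):
        if k == 1:
            return (a, 1)  # a single point is a 1-term progression for any step
        # Only steps n with a + (k-1)n <= max(A) can possibly work.
        for n in range(1, (top - a) // (k - 1) + 1):
            if all(a + j * n in members for j in range(k)):
                return (a, n)
    return None


# Integers 1..14 whose predecessor has only the digits 0 and 1 in base 3:
# a classical example of a set with no 3-term progression.
sparse = {1, 2, 4, 5, 10, 11, 13, 14}
print(find_progression(sparse, 3))        # no progression in this set
print(find_progression(sparse | {7}, 3))  # adding 7 creates 1, 4, 7
```

The example set has density 8/14 in {1, ..., 14} and still avoids 3-term progressions, which illustrates why the theorem only takes effect once $N$ is large enough relative to $\delta$ and $k$.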

This provides a considerable strengthening of a much older result of van
der Waerden [Wae27], according to which any colouring of $\mathbb{N}$ using a bounded
number of colours witnesses arbitrarily long finite arithmetic progressions that
are monochromatic. Since any colouring with at most $c$ colours must have at
least one colour class of upper Banach density at least $1/c$, van der Waerden's
Theorem can be deduced by applying Szemerédi's Theorem to the intersection
of that class with sufficiently long discrete intervals in $\mathbb{N}$.

Shortly after the appearance of Szemerédi's ingenious combinatorial proof,
Furstenberg gave a new proof of the above theorem in [Fur77] using a
superficially quite different approach, relying on a conversion to a problem about
probability-preserving dynamical systems.

Such a system consists of a probability space $(X, \Sigma, \mu)$ together with an
invertible, measurable, $\mu$-preserving transformation $T : X \to X$. Furstenberg
proved that all such systems enjoy a property of 'multiple recurrence':

Multiple Recurrence Theorem. Whenever $(X, \Sigma, \mu)$ and $T$ are as above,
if $k \geq 1$ and $A \in \Sigma$ has $\mu(A) > 0$ then

$$
\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(A \cap T^{-n}(A) \cap \cdots \cap T^{-(k-1)n}(A)\big) > 0.
$$

In particular, there is some $n \geq 1$ such that

$$
\mu\big(A \cap T^{-n}(A) \cap \cdots \cap T^{-(k-1)n}(A)\big) > 0.
$$
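
As a toy illustration of the quantities in the Multiple Recurrence Theorem, the sketch below evaluates them exactly for the rotation $T(x) = x + 1$ on $\mathbb{Z}/m$ with uniform measure. This finite system, and the names `recurrence_measure` and `recurrence_average`, are assumptions chosen for illustration only; the theorem itself concerns arbitrary probability-preserving systems.

```python
from fractions import Fraction


def recurrence_measure(A, m, k, n):
    """mu(A & T^{-n}A & ... & T^{-(k-1)n}A) for T(x) = x + 1 on Z/m, mu uniform."""
    members = {x % m for x in A}
    # x lies in T^{-jn}(A) exactly when x + j*n (mod m) lies in A.
    hits = sum(all((x + j * n) % m in members for j in range(k)) for x in range(m))
    return Fraction(hits, m)


def recurrence_average(A, m, k, N):
    """(1/N) * sum over n = 1..N of the measures above."""
    return sum(recurrence_measure(A, m, k, n) for n in range(1, N + 1)) / N


print(recurrence_measure({0, 1}, 5, 2, 1))
print(recurrence_average({0, 1}, 5, 2, 5))
```

For $k = 2$ and $N$ equal to the period $m$, the average equals $\mu(A)^2$, since each pair of points of $A$ is counted for exactly one shift $n$ in a full period.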



It is worth noting that analogously to this ergodic-theoretic proof of
Szemerédi's Theorem, it is possible to deduce the colouring theorem of van der
Waerden from a multiple recurrence result in topological dynamics. We will
not be concerned with this story here, but it is reported in detail in
Furstenberg's book [Fur81].

Shortly after the above result appeared, Furstenberg and Katznelson
realized that the same basic method could be modified to apply to collections
of commuting measure-preserving transformations, and proved the following
in [FK78].

Theorem A (Multidimensional Multiple Recurrence Theorem). If $(X, \Sigma, \mu)$
is a probability space, $T_1, T_2, \ldots, T_d$ are commuting measurable invertible
$\mu$-preserving self-maps of $X$ and $A \in \Sigma$ has $\mu(A) > 0$, then

$$
\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T_1^{-n}(A) \cap \cdots \cap T_d^{-n}(A)\big) > 0.
$$



Of course this result implies one-dimensional multiple recurrence by
setting $d := k$ and $T_i := T^i$ for $i = 0, 1, \ldots, k-1$. In addition, Furstenberg
and Katznelson were able to convert Theorem A back into a multidimensional
combinatorial result generalizing Szemerédi's Theorem.

Multidimensional Szemerédi Theorem. For any $\delta > 0$ and $d \geq 1$ there
is some $N_0 \geq 1$ such that if $N \geq N_0$ then any $A \subseteq \{1, 2, \ldots, N\}^d$ with
$|A| \geq \delta N^d$ includes the vertex set of the outer face of a nontrivial upright
simplex:

$$
A \supseteq \{\mathbf{a} + n\mathbf{e}_1, \mathbf{a} + n\mathbf{e}_2, \ldots, \mathbf{a} + n\mathbf{e}_d\}
$$

for some $\mathbf{a} \in \{1, 2, \ldots, N\}^d$ and $n \geq 1$, where
$\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_d$ are the usual basis vectors of $\mathbb{Z}^d$.
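
The configuration in the Multidimensional Szemerédi Theorem can be searched for directly in small cases. The following sketch, with the hypothetical name `find_simplex_face`, looks for a base point and a step $n \geq 1$ such that every point obtained by moving $n$ along one coordinate direction lies in the set. It is a brute-force illustration under these assumptions, not an efficient method.

```python
from itertools import product


def find_simplex_face(A, N, d):
    """Return (a, n) with a + n*e_i in A for every i = 1..d, or None.

    The base point a ranges over {1, ..., N}^d and need not itself lie in A.
    """
    points = set(A)
    for a in product(range(1, N + 1), repeat=d):
        for n in range(1, N):
            face = [
                tuple(c + (n if j == i else 0) for j, c in enumerate(a))
                for i in range(d)
            ]
            if all(p in points for p in face):
                return a, n
    return None


print(find_simplex_face({(2, 1), (1, 2)}, 3, 2))          # base point (1, 1), step 1
print(find_simplex_face({(1, 1), (2, 2), (3, 3)}, 3, 2))  # the diagonal has no such face
```

The diagonal example fails because $(a_1 + n, a_2)$ and $(a_1, a_2 + n)$ cannot both have equal coordinates when $n \geq 1$.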

This ergodic-theoretic approach to results in additive combinatorics has
since developed into a whole subdiscipline, sometimes termed 'Ergodic Ramsey
Theory'; see, for instance, Bergelson's survey [Ber96]. In particular,
Furstenberg and Katznelson used this approach to prove a number of further
results concerning some form of 'recurrence', culminating in the following
density version of the classical Hales-Jewett Theorem [HJ63] proved
in [FK91]:

Theorem B (Density version of the Hales-Jewett Theorem). For any $\delta > 0$
and $k \geq 1$ there is some $N_0 \geq 1$ such that if $N \geq N_0$ then any
$A \subseteq [k]^N$ with $|A| \geq \delta k^N$ includes a combinatorial line: a subset
$L \subseteq [k]^N$ of the form

$$
L = \{w \in [k]^N : w|_{[N] \setminus J} = w_0 \text{ and } w_j \text{ is the same element of } [k] \text{ for all } j \in J\},
$$

for some fixed nonempty $J \subseteq [N]$ and $w_0 \in [k]^{[N] \setminus J}$.
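
The notion of a combinatorial line in Theorem B can be made concrete for small $k$ and $N$. In the sketch below, a line is encoded by a template over $[k] \cup \{*\}$ with at least one wildcard: the wildcard positions play the role of $J$ and the fixed letters the role of $w_0$. The template encoding and the names `combinatorial_lines` and `find_line` are illustrative assumptions.

```python
from itertools import product


def combinatorial_lines(k, N):
    """Yield each combinatorial line of [k]^N as a tuple of k words."""
    alphabet = list(range(1, k + 1))
    for template in product(alphabet + ["*"], repeat=N):
        if "*" in template:  # J, the set of wildcard positions, must be nonempty
            yield tuple(
                tuple(v if c == "*" else c for c in template) for v in alphabet
            )


def find_line(A, k, N):
    """Return a combinatorial line contained in A, or None if there is none."""
    words = set(A)
    for line in combinatorial_lines(k, N):
        if all(w in words for w in line):
            return line
    return None


print(sum(1 for _ in combinatorial_lines(2, 2)))  # (k+1)^N - k^N lines in total
print(find_line({(1, 2), (2, 1)}, 2, 2))
print(find_line({(1, 1), (2, 2)}, 2, 2))
```

Since the number of lines is $(k+1)^N - k^N$, this enumeration is only feasible for very small parameters.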

In fact, this result implies most of the other main results in density Ramsey
Theory, including Szemerédi's Theorem and its multidimensional
generalization. This implication holds exactly as in the older setting of colouring
Ramsey Theorems, which is well-treated in the book [GRS90] of Graham,
Rothschild and Spencer.

In addition to achieving some striking new combinatorial results, Ergodic 
Ramsey Theory has also motivated new ergodic-theoretic questions, and has 
witnessed an ongoing interplay between insights into these two aspects of the 
subject. 

One basic question that was resolved only recently is whether the 'multiple
ergodic averages' studied in Theorems A and B above actually converge
(that is, whether 'liminf' can be replaced with 'lim'). In the case of the
original Multiple Recurrence Theorem, this was finally shown to be so by Host
and Kra in [HK05], following the establishment of several special cases and
related results over two decades in [CL84, CL88a, CL88b, Zha96, FW96,
HK01] (see also Ziegler's paper [Zie07] for another proof of the Host-Kra
result). The more general setting of Theorem A was then settled by Tao
in [Tao08].

Theorem C (Norm convergence of nonconventional averages). For any commuting
tuple of invertible measurable $\mu$-preserving transformations
$T_1, T_2, \ldots, T_d \curvearrowright (X, \Sigma, \mu)$ and any functions
$f_1, f_2, \ldots, f_d \in L^\infty(\mu)$, the multiple ergodic averages

$$
\frac{1}{N} \sum_{n=1}^{N} \prod_{i=1}^{d} f_i \circ T_i^n
$$

converge in $L^2(\mu)$ as $N \to \infty$.

While the sequence of works preceding the proof of convergence in the 
one-dimensional setting of the Multiple Recurrence Theorem develops a large 
body of ergodic-theoretic machinery for the analysis of these averages, Tao 
departs quite markedly from those approaches and effectively converts the 
problem of convergence into a quantitative assertion concerning averages of 
$[-1, 1]$-valued functions on large finite grids $\{1, 2, \ldots, N\}^d$.

A new proof of Tao's Theorem was given using classical ergodic-theoretic
machinery in [Aus09]. It turns out that this convergence can be proved
relatively quickly using a version of the older approaches, with the one new
twist that starting from a system of commuting transformations of interest
$T_1, T_2, \ldots, T_d \curvearrowright (X, \Sigma, \mu)$ one must first pass to a
carefully-chosen extended system
$\tilde{T}_1, \tilde{T}_2, \ldots, \tilde{T}_d \curvearrowright (\tilde{X}, \tilde{\Sigma}, \tilde{\mu})$
(that is, a new system for which the original one is isomorphic to the action
of the $\tilde{T}_i$'s on some globally invariant $\sigma$-subalgebra of
$\tilde{\Sigma}$: in ergodic-theoretic terms, the original system is a 'factor'
of the new one). If the extension is constructed correctly then the asymptotic
behaviour of the multiple ergodic averages associated to it admits a
simplification allowing them to be compared with a similar system of averages
involving only $k - 1$ transformations; from this point convergence in $L^2$
follows quickly by induction on $k$. The need for this extension also offers
some explanation for the advantage that Tao gains in his approach to Theorem C
by converting to the finitary, combinatorial world: during the course of his
proof he constructs new functions from the initial data of the problem in ways
that cannot be used to construct measurable functions in the ergodic-theoretic
setting, but suitable measurable functions are available using the larger
$\sigma$-algebra of the extended system.

Theorem C proves the convergence of the scalar averages appearing in
Theorem A because

$$
\frac{1}{N} \sum_{n=1}^{N} \mu\big(T_1^{-n}(A) \cap T_2^{-n}(A) \cap \cdots \cap T_d^{-n}(A)\big)
= \int_X \frac{1}{N} \sum_{n=1}^{N} \prod_{i=1}^{d} (f_i \circ T_i^n) \, d\mu
$$

when $f_1 = f_2 = \cdots = f_d = 1_A$. Note that another re-proof of Tao's
theorem involving non-standard analysis has been given by Towsner in [Tow09],
and that a different construction of some extensions of probability-preserving
systems that can be used as in the proof of [Aus09] has since been given by
Host in [Hos09].
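
The identity linking the set-theoretic averages of Theorem A to the functional averages of Theorem C can be checked term by term in a small finite model. The sketch below assumes $X = (\mathbb{Z}/m)^2$ with uniform measure and takes $T_1$, $T_2$ to be the two coordinate shifts, which commute; the model and the name `both_sides` are illustrative choices only.

```python
from fractions import Fraction
from itertools import product


def both_sides(A, m, n):
    """Return (mu(T1^{-n}(A) & T2^{-n}(A)), integral of (1_A o T1^n)(1_A o T2^n))."""
    X = list(product(range(m), repeat=2))
    A = set(A)

    def T1(x):  # n-th power of the shift in the first coordinate
        return ((x[0] + n) % m, x[1])

    def T2(x):  # n-th power of the shift in the second coordinate
        return (x[0], (x[1] + n) % m)

    preimage1 = {x for x in X if T1(x) in A}
    preimage2 = {x for x in X if T2(x) in A}
    lhs = Fraction(len(preimage1 & preimage2), len(X))
    # Integrate the product of the two shifted indicator functions of A.
    rhs = Fraction(sum((T1(x) in A) * (T2(x) in A) for x in X), len(X))
    return lhs, rhs


print(both_sides({(0, 0), (1, 0), (0, 1)}, 3, 1))
```

Both entries agree because $1_A \circ T_i^n$ is the indicator function of $T_i^{-n}(A)$, so the product of indicators is the indicator of the intersection.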

Having found the extended systems appearing in the new proof of Theorem C,
it turns out that they also afford a somewhat simplified description
of the limiting value of the scalar averages appearing in Theorem A. These
limiting values can always be expressed in terms of a certain $(d+1)$-fold
self-joining of the system $(X, \Sigma, \mu, T_1, T_2, \ldots, T_d)$ (which appears
already in the works of Furstenberg and Katznelson), and one finds that for the
extended system this self-joining takes a special form. Crucially, that special
form is precisely the hypothesis required to apply another result of Tao: the
infinitary analog of the hypergraph removal lemma from [Tao07]. This leads
fairly quickly to a new proof of Theorem A (and hence also one-dimensional
multiple recurrence and their combinatorial consequences), which appeared
in [Ausb].

A similar story is now known in the setting of Theorem B. For their 
proof of that theorem, Furstenberg and Katznelson first provided a correspon- 
dence with a class of stochastic processes enjoying stationarity with respect to 
some semigroup of transformations. This is broadly similar to Furstenberg's 
original correspondence between Szemerédi's Theorem and the Multiple
Recurrence Theorem, but differs considerably in its details. Having built this
bridge to a class of stochastic processes, Furstenberg and Katznelson then
used analogs of their earlier structural results from the setting of
probability-preserving $\mathbb{Z}^d$-actions to prove the 'recurrence' result that is the
translation of Theorem B. Here, too, it turns out that the strategy of seeking
extended systems in which the behaviour of interest is simplified leads to a
new proof of that recurrence result, and so overall to a considerably shortened
proof of Theorem B, where again the punchline is an implementation of Tao's
infinitary hypergraph removal. This new proof of Theorem B appears in [Ausa]. It
was discovered simultaneously with the work of the Polymath project [Pola],
which provided the first finitary, effective proof of that theorem, and the proof
of [Ausa] used a key construction discovered by the members of that project
(again, suitably translated to apply to the stochastic processes).

More recently still, in pursuit of some convergence results for 'polynomial'
analogs of the functional averages of Theorem C, it was found that a
very abstract, unified approach could be given to the construction of the dif- 
ferent extensions underlying the above-mentioned proofs of Theorems C, A 
and B. This rests on the notion of a system that is 'sated' relative to another 
class of systems. In this dissertation, the new proofs of the above results are 
re-told using this unifying language, and some speculations offered concern- 
ing some further extensions of this machinery. 

Outline of the following chapters 

In the next chapter we recall some basic definitions and conventions from the 
study of measurable dynamical systems, and then introduce the chief techni- 
cal innovation on which most of the remaining chapters will rest: a special 
property of certain dynamical systems called 'satedness'. The main result of 
that chapter, Theorem 2.3.2, asserts that any probability-preserving dynamical
system admits extensions that enjoy this 'satedness' (where precisely what
this means is relative to a choice of another class of systems). 

In Chapter 3 we use the existence of sated extensions to prove Theorem C.
After the introduction of another important technical device, the 'Furstenberg
self-joining', this follows by a quick induction once the strategy of passing to
a sated extension has been decided.

Chapter 4 is dedicated to Theorem A. In this case the use of sated
extensions gives a relatively easy reduction of the proof to a case in which the
Furstenberg self-joining (which describes the limiting averages of interest) 
admits a rather detailed structural description; but the use of that description 
to deduce the desired positivity of these averages is still rather involved. This 
requires an implementation of (a very slight modification of) Tao's 'infinitary 
hypergraph removal lemma', which we will recall for completeness. 

In Chapter 5 we prove Theorem B. This proof follows very closely that of
Theorem A, notwithstanding that the category of dynamical systems in which 
the proof takes place is very different. However, the unusual features of this 
new category will require that we quickly re-examine the existence of sated 
extensions proved in Chapter 2 to check that a slightly modified version of 
that result holds here. After recalling Furstenberg and Katznelson's origi- 
nal reformulation of Theorem B in terms of a 'recurrence' property of cer- 
tain 'strongly stationary' stochastic processes, we establish this new notion 
of 'coordinatewise-satedness' and show that in this world it implies a similar
structure for certain joint distributions to that obtained for the Furstenberg
self-joining in Chapter 4. The proof of Theorem B is then completed by another
appeal to infinitary hypergraph removal, essentially identical to that in
Chapter 4.

Finally, Chapter 6 contains some speculations around an important question
left open by our work. In the case of $\mathbb{Z}^d$-actions treated by Chapters 3
and 4 one can discern in the background a very general ergodic-theoretic
meta-question concerning the possible joinings among systems enjoying various
additional invariances. This is formulated precisely in Section 4.1, but
in that section it is answered only in a special case that suffices for the proof 
of Theorem A. A more general answer would be very interesting in its own 
right, as well as potentially offering new insights on other generalizations of 
nonconventional average convergence and multiple recurrence. In Chapter 6
we will formulate a conjecture that would answer this question much more 
completely. 
Chapter 2
Setting the stage 



A handful of key technical ideas in ergodic theory will drive all of the proofs 
in the later chapters of this work. After recalling some standard definitions 
and notation in the first section below, we introduce two such key ideas: that 
of a subclass of a class of dynamical systems that has the property of be- 
ing 'idempotent', and the constructions that this assumption of idempotence 
enables; and then the possibility of a system being 'sated' relative to such an 
idempotent class, together with the result that all systems have extensions that 
are sated in this way. 

These preliminary sections provide the necessary background for Chap- 
ters 3 and 4 (and also Chapter 6). Unfortunately, the slightly unusual class of 
stochastic processes that appears in Chapter 5 is a little less willing to be anal- 
ysed using this standard framework: the key ideas of idempotence and sated- 
ness will be central there too, but only after being modified to suit that class. 
The modifications will be explained early in that chapter, together with those 
small changes that must accordingly be made to the proofs in Sections 2.2
and 2.3. In principle one could give a unified treatment of all of these settings,
but only at the expense of working with quite abstractly-defined categories of 
dynamical system and operations on them, in which our basic intuitions for 
the notions recalled in Section 2.1 may become obscured. Although more
unified, that route seems to pose too great a risk to the clarity of the other 
chapters, and so we shall only indicate it in passing during Chapter 5. 
2.1 Probability-preserving systems



Throughout this paper $(X, \Sigma)$ will denote a measurable space. Since our main
results pertain only to the joint distribution of countably many bounded
real-valued functions on this space and their shifts under some measurable
transformations, by passing to the image measure on a suitable product space we
may always assume that $(X, \Sigma)$ is standard Borel, and this will prove
convenient for some of our later constructions. In addition, $\mu$ will always denote
a probability measure on $\Sigma$. We shall write $(X^S, \Sigma^{\otimes S})$ for the usual product
measurable structure indexed by a set $S$, and $\mu^{\otimes S}$ for the product measure and
$\mu^{\Delta S}$ for the diagonal measure on this structure respectively. Given a
measurable map $\phi : (X, \Sigma) \to (Y, \Phi)$ to another measurable space, we shall write
$\phi_\# \mu$ for the resulting pushforward probability measure on $(Y, \Phi)$.

Suppose now that $\Gamma$ is a discrete semigroup, and consider the class of all
probability-preserving actions $T : \Gamma \curvearrowright (X, \Sigma, \mu)$ on standard Borel
probability spaces; these will be referred to as $\Gamma$-systems, and will often be
denoted by either the quadruple $(X, \Sigma, \mu, T)$ or simply by a boldface letter
such as $\mathbf{X}$. If $\Lambda \leq \Gamma$ is a subgroup we denote by $T^{\upharpoonright \Lambda}$ the
$\Lambda$-action on $(X, \Sigma, \mu)$ defined by $(T^{\upharpoonright \Lambda})^\gamma := T^\gamma$ for
$\gamma \in \Lambda$, and refer to this as the $\Lambda$-subaction, and if
$\mathbf{X} = (X, \Sigma, \mu, T)$ is a $\Gamma$-system then we write similarly
$\mathbf{X}^{\upharpoonright \Lambda}$ for the system $(X, \Sigma, \mu, T^{\upharpoonright \Lambda})$ and refer to it
as a subaction system.

A $\Gamma$-system $(X, \Sigma, \mu, T)$ is trivial if $\mu$ is supported on a single point.
Since any two such systems are measure-theoretically isomorphic simply by
identifying these single points, we will usually refer to 'the' trivial system.

We will make repeated use of a handful of standard constructions and
properties of $\Gamma$-systems.

Factors and joinings 

A factor of the $\Gamma$-system $(X, \Sigma, \mu, T)$ is a globally $T$-invariant
$\sigma$-subalgebra $\Phi \leq \Sigma$. Relatedly, a factor map from one $\Gamma$-system
$T : \Gamma \curvearrowright (X, \Sigma, \mu)$ to another $S : \Gamma \curvearrowright (Y, \Phi, \nu)$ is a
measurable map $\pi : X \to Y$ such that $\nu = \pi_\# \mu$ and
$S^\gamma \circ \pi = \pi \circ T^\gamma$ for all $\gamma \in \Gamma$. This situation is often signified by
writing $\pi : (X, \Sigma, \mu, T) \to (Y, \Phi, \nu, S)$. Factor maps comprise the natural
morphisms between systems for a fixed acting semigroup.

To any factor map $\pi$ is associated the factor $\{\pi^{-1}(A) : A \in \Phi\} \leq \Sigma$.
Two factor maps $\pi$ and $\psi$ are equivalent if the $\sigma$-subalgebras of $\Sigma$ that they
generate are equal up to $\mu$-negligible sets, in which case we shall write
$\pi \simeq \psi$; this clearly defines an equivalence relation among factors.

It is a standard fact that in the category of standard Borel spaces
equivalence classes of factors are in bijective correspondence with equivalence
classes of globally invariant $\sigma$-subalgebras under the relation of equality
modulo negligible sets. A treatment of these classical issues may be found, for
example, in Chapter 2 of Glasner [Gla03]. Given a globally invariant
$\sigma$-subalgebra in $\mathbf{X}$, a choice of factor $\pi : \mathbf{X} \to \mathbf{Y}$ generating that
$\sigma$-subalgebra will be referred to as coordinatizing the $\sigma$-subalgebra.

More generally, the factor map $\pi : (X, \Sigma, \mu, T) \to (Y, \Phi, \nu, S)$ contains
$\psi : (X, \Sigma, \mu, T) \to (Z, \Psi, \theta, R)$ if $\pi^{-1}(\Phi) \supseteq \psi^{-1}(\Psi)$ up to
$\mu$-negligible sets. Another standard feature of standard Borel spaces is that
this inclusion is equivalent to the existence of a factorizing factor map
$\phi : (Y, \Phi, \nu, S) \to (Z, \Psi, \theta, R)$ with $\psi = \phi \circ \pi$ $\mu$-a.s., and that a
measurable analog of the Schroeder-Bernstein Theorem holds: $\pi \simeq \psi$ if and
only if a single such $\phi$ may be chosen that is invertible away from some
negligible subsets of the domain and target. If $\pi$ contains $\psi$ we shall write
$\pi \succsim \psi$ or $\psi \precsim \pi$.

If $\pi : \mathbf{X} \to \mathbf{Y}$ and $\psi : \mathbf{X} \to \mathbf{Z}$ are any two factor maps as above
(not necessarily ordered), then the $\sigma$-subalgebra $\pi^{-1}(\Phi) \vee \psi^{-1}(\Psi)$ is
another factor of $\mathbf{X}$. In general we will write $\pi \vee \psi$ for an arbitrary choice
of factor map coordinatizing this factor, and similarly for larger collections
of factor maps.

Dual to the idea of a factor is that of an extension: if $\mathbf{X}$ is a $\Gamma$-system, then
an extension of $\mathbf{X}$ is another $\Gamma$-system $\tilde{\mathbf{X}}$ together with a factor map
$\pi : \tilde{\mathbf{X}} \to \mathbf{X}$.

More general than the notion of a factor is that of a joining: if
$\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_k$ are $\Gamma$-systems then a joining of them is another
$\Gamma$-system $\mathbf{X}$ together with factor maps $\pi_i : \mathbf{X} \to \mathbf{X}_i$ such that these
$\pi_i$ together generate the whole $\sigma$-algebra of $\mathbf{X}$. Since their introduction
by Furstenberg in [Fur67], joinings have become one of the most important
concepts in the ergodic theorist's vocabulary, as is well-demonstrated in
Glasner's book [Gla03].
Partially invariant factors

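Given a $\Gamma$-system $\mathbf{X} = (X, \Sigma, \mu, T)$, the $\sigma$-algebra of sets $A \in \Sigma$ for
which $\mu(A \triangle T^\gamma(A)) = 0$ for all $\gamma \in \Gamma$ is $T$-invariant, so defines a factor
of $\mathbf{X}$. More generally, if $\Gamma$ is a group and $\Lambda \leq \Gamma$ then we can consider the
$\sigma$-algebra $\Sigma^{T \upharpoonright \Lambda}$ generated by all $T^{\upharpoonright \Lambda}$-invariant sets: we refer
to this as the $\Lambda$-partially invariant factor. Note that in this case the
condition that $\Lambda$ be normal is needed for this to be a globally $T$-invariant
factor. Similarly, if $S \subseteq \Gamma$ and $\Lambda$ is the normal subgroup generated by $S$,
we will sometimes write $\Sigma^{T \upharpoonright S}$ for $\Sigma^{T \upharpoonright \Lambda}$.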

If moreover $\Gamma$ is Abelian and $T_1$ and $T_2$ are two commuting actions of $\Gamma$
on $(X, \Sigma, \mu)$, then we can define a third action $T_1 T_2^{-1}$ by setting
$(T_1 T_2^{-1})^\gamma := T_1^\gamma T_2^{\gamma^{-1}}$. Given this we often write $\Sigma^{T_1 = T_2}$ in
place of $\Sigma^{T_1 T_2^{-1}}$, and similarly for a larger number of actions of the
same group.

Relative independence 

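If $\Sigma_i \geq \Xi_i$ are factors of $(X, \Sigma, \mu, T)$ for each $i \leq d$, then the tuple of
factors $(\Sigma_1, \Sigma_2, \ldots, \Sigma_d)$ is relatively independent over the tuple
$(\Xi_1, \Xi_2, \ldots, \Xi_d)$ if whenever $f_i \in L^\infty(\mu)$ is $\Sigma_i$-measurable for each
$i \leq d$ we have

$$
\int_X \prod_{i \leq d} f_i \, d\mu = \int_X \prod_{i \leq d} \mathsf{E}_\mu(f_i \,|\, \Xi_i) \, d\mu.
$$

The information that various joint distributions are relatively independent will
repeatedly prove pivotal in the following. Sometimes for brevity we will
write that '$\Sigma_1$ is relatively independent from $\Sigma_2, \Sigma_3, \ldots, \Sigma_d$ over $\Xi_1$' if
$(\Sigma_1, \Sigma_2, \ldots, \Sigma_d)$ is relatively independent over $(\Xi_1, \Sigma_2, \ldots, \Sigma_d)$.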

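In case $\Gamma$ is a group (not just a semigroup, so each $T^\gamma$ is invertible) we can
construct examples of this situation as follows. Suppose that $\mathbf{Y} = (Y, \Phi, \nu, S)$
is a $\Gamma$-system and

$$
\pi_i : \mathbf{X}_i = (X_i, \Sigma_i, \mu_i, T_i) \to \mathbf{Y}
$$

are extensions of it for $i = 1, 2, \ldots, k$. Then the relatively independent
product of the systems $\mathbf{X}_i$ over their factor maps $\pi_i$ is the system

$$
\prod_{\{\pi_1 = \ldots = \pi_k\}} \mathbf{X}_i
= \Big( \prod_{\{\pi_1 = \ldots = \pi_k\}} X_i,\ \bigotimes_{\{\pi_1 = \ldots = \pi_k\}} \Sigma_i,\ \bigotimes_{\{\pi_1 = \ldots = \pi_k\}} \mu_i,\ T_1 \times \cdots \times T_k \Big)
$$

where

$$
\prod_{\{\pi_1 = \ldots = \pi_k\}} X_i := \{(x_1, \ldots, x_k) \in X_1 \times \cdots \times X_k : \pi_1(x_1) = \ldots = \pi_k(x_k)\},
$$

$\bigotimes_{\{\pi_1 = \ldots = \pi_k\}} \Sigma_i$ is the restriction of $\Sigma_1 \otimes \cdots \otimes \Sigma_k$ to this
subset of $X_1 \times \cdots \times X_k$, and

$$
\bigotimes_{\{\pi_1 = \ldots = \pi_k\}} \mu_i := \int_Y \bigotimes_{i=1}^{k} \mu_{i,y} \, \nu(dy)
$$

with $y \mapsto \mu_{i,y}$ an arbitrary choice of disintegration of $\mu_i$ over $\pi_i$. A quick
check shows that the factors generated by the coordinate projections
$\phi_j : \prod_{\{\pi_1 = \ldots = \pi_k\}} \mathbf{X}_i \to \mathbf{X}_j$ are relatively independent over the
common further factor map

$$
\pi_1 \circ \phi_1 \simeq \ldots \simeq \pi_k \circ \phi_k : \prod_{\{\pi_1 = \ldots = \pi_k\}} \mathbf{X}_i \to \mathbf{Y}.
$$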

In case $k = 2$ we write the relatively independent product more simply as
$\mathbf{X}_1 \times_{\{\pi_1 = \pi_2\}} \mathbf{X}_2$, and in addition if $\mathbf{X}_1 = \mathbf{X}_2 = \mathbf{X}$ and
$\pi_1 = \pi_2 = \pi$ then we will abbreviate this further to $\mathbf{X} \times_\pi \mathbf{X}$, and
similarly for the individual spaces and measures.
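
For finite spaces the relatively independent product of two measures can be written out explicitly, which may help to fix ideas. The sketch below assumes finite point sets, measures given as dictionaries of exact fractions, and factor maps given as dictionaries onto a common base whose two pushforward measures agree. The name `relatively_independent_product` and the sample data are hypothetical, and the dynamics play no role in this toy computation.

```python
from collections import defaultdict
from fractions import Fraction


def relatively_independent_product(mu1, pi1, mu2, pi2):
    """Measure on pairs (x1, x2) with pi1[x1] == pi2[x2].

    Assumes the pushforwards of mu1 under pi1 and of mu2 under pi2 coincide.
    """
    nu = defaultdict(Fraction)  # base measure nu = pushforward of mu1 under pi1
    for x, mass in mu1.items():
        nu[pi1[x]] += mass
    joint = {}
    for x1, m1 in mu1.items():
        y = pi1[x1]
        for x2, m2 in mu2.items():
            if pi2[x2] == y and nu[y] > 0:
                # nu(y) * mu_{1,y}(x1) * mu_{2,y}(x2), where mu_{i,y} = mu_i / nu(y)
                joint[(x1, x2)] = m1 * m2 / nu[y]
    return joint


mu1 = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
pi1 = {"a": 0, "b": 0, "c": 1}
mu2 = {"u": Fraction(1, 2), "v": Fraction(1, 4), "w": Fraction(1, 4)}
pi2 = {"u": 0, "v": 1, "w": 1}
joint = relatively_independent_product(mu1, pi1, mu2, pi2)
print(joint)
```

In this example the joint measure has total mass 1 and its two coordinate marginals recover `mu1` and `mu2`, as one expects of a joining of the two spaces over their common base.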

The need for the invertibility of $T$ in this construction arises in checking
that $\bigotimes_{\{\pi_1 = \ldots = \pi_k\}} \mu_i$ is invariant under the product action. For example, if
$k = 2$ then the invariance of $\mu_i$ under $T_i$ implies that for each $\gamma \in \Gamma$ the
disintegrations $\mu_{i,y}$ satisfy

$$
\int_Y (T_i^\gamma)_\# \mu_{i,y} \, \nu(dy) = (T_i^\gamma)_\# \mu_i = \mu_i = \int_Y \mu_{i,y} \, \nu(dy).
$$

However, to argue from here to the invariance of $\mu_1 \otimes_{\{\pi_1 = \pi_2\}} \mu_2$ we must know
in addition that for $\nu$-almost every $y \in Y$ there is a unique point $S^{\gamma^{-1}}(y) \in Y$
such that $(T_i^\gamma)_\# \mu_{i, S^{\gamma^{-1}}(y)}$ is supported on the fibre over $y$. Given this and
the essential uniqueness of disintegrations, the above equation implies that
$(T_i^\gamma)_\# \mu_{i,y} = \mu_{i, S^\gamma(y)}$ for $\nu$-almost every $y$, from which it also follows that

$$
(T_1^\gamma \times T_2^\gamma)_\# (\mu_{1,y} \otimes \mu_{2,y}) = \mu_{1, S^\gamma(y)} \otimes \mu_{2, S^\gamma(y)}
$$

$\nu$-almost surely, so that integrating again with respect to $y$ gives the desired
invariance of $\mu_1 \otimes_{\{\pi_1 = \pi_2\}} \mu_2$. However, this latter argument is valid only if
we can obtain the above equality pointwise in $y$, and this can fail if $S^\gamma$ is not
invertible.

Inverse limits



An inverse sequence of $\Gamma$-systems is a family of $\Gamma$-systems
$(X_m, \Sigma_m, \mu_m, T_m)$ together with factor maps

$$
\psi^m_k : (X_m, \Sigma_m, \mu_m, T_m) \to (X_k, \Sigma_k, \mu_k, T_k) \quad \text{for all } m \geq k
$$

satisfying the compatibility property that $\psi^k_\ell \circ \psi^m_k = \psi^m_\ell$ whenever
$m \geq k \geq \ell$. From such a family one can construct an inverse limit

$$
\lim_{m \leftarrow} \big( (X_m, \Sigma_m, \mu_m, T_m)_{m \geq 0}, (\psi^m_k)_{m \geq k \geq 0} \big) = (X, \Sigma, \mu, T)
$$

together with a sequence of factor maps

$$
\psi_m : (X, \Sigma, \mu, T) \to (X_m, \Sigma_m, \mu_m, T_m)
$$

such that $\psi^m_k \circ \psi_m = \psi_k$ whenever $m \geq k$, and such that the lifted factors
$\psi_m^{-1}(\Sigma_m)$ together generate the whole of $\Sigma$. Moreover, subject to these
stipulations this inverse limit is unique up to isomorphisms that intertwine all
the factor maps $\psi_m$. This construction is described, for example, in Section 6.3
of Glasner [Gla03].

2.2 Idempotent classes 

In much of the following we will be concerned with properties of one system 
that are defined relative to some other class of systems. 

Definition 2.2.1 (Idempotent class). A subclass $\mathsf{C}$ of $\Gamma$-systems is idempotent
if it contains the trivial system and is closed under measure-theoretic
isomorphism, inverse limits and joinings.

Note that our 'classes' need not be sets in the sense of ZFC. In all sub- 
sequent constructions involving these classes it will be clear that we need 
only some set-indexed family of members, and so we will not generally pass 
comment on this set-theoretic distinction. Alternatively, we could circum- 
vent this issue altogether by working only with probability-preserving sys- 
tems modelled by some Borel transformations and invariant probability mea- 
sure on, say, the Cantor space, since any standard Borel system admits such a 
model up to measure-theoretic isomorphism (see, for instance, Theorem 2.15
in [Gla03]).

Examples Suppose that $\Gamma$ is a group and that $\Lambda \leq \Gamma$. Then the class of all
$\Gamma$-systems for which the subaction of $\Lambda$ is trivial is easily seen to be
idempotent. This important example will usually be denoted by $\mathsf{Z}_0^\Lambda$ in the
following.

More generally, for $\Lambda$ as above and any $n \in \mathbb{N}$ we let $\mathsf{Z}_n^\Lambda$ denote the class
of systems on which the $\Lambda$-subaction is a distal tower of height at most $n$, in
the sense of direct integrals of compact homogeneous space data introduced
in [Ausc] to allow for the case of non-ergodic systems. Standard results on
the possible joinings and inverse limits of isometric extensions show that this
class is idempotent (see [Ausc, Ausd]). Those arguments also allow us to
identify certain natural idempotent subclasses of $\mathsf{Z}_n^\Lambda$, such as the class
$\mathsf{Z}_{\mathrm{Ab},n}^\Lambda$ of those systems with $\Lambda$-subaction a distal tower of height at most
$n$ and in which each isometric extension is Abelian. ◁

Lemma 2.2.2. If $\mathsf{C}$ is an idempotent class of $\Gamma$-systems then any $\Gamma$-system $\mathbf{X}$
has an essentially unique maximal factor in the class $\mathsf{C}$.

Proof It is clear that under the above assumption the family of factors

$$
\{\Xi \leq \Sigma : \Xi \text{ is generated by a factor map to a system in } \mathsf{C}\}
$$

is nonempty (it contains $\{\emptyset, X\}$, which corresponds to the trivial system),
upwards directed (because $\mathsf{C}$ is closed under joinings) and closed under taking
$\sigma$-algebra completions of increasing unions (because $\mathsf{C}$ is closed under inverse
limits). There is therefore a maximal $\sigma$-subalgebra in this family. □

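Definition 2.2.3. If $\mathsf{C}$ is an idempotent class then $\mathbf{X}$ is a $\mathsf{C}$-system if
$\mathbf{X} \in \mathsf{C}$, and for any $\mathbf{X}$ we write $\zeta^{\mathbf{X}}_{\mathsf{C}} : \mathbf{X} \to \mathsf{C}\mathbf{X}$ for an
arbitrarily-chosen coordinatization of its maximal $\mathsf{C}$-factor given by the above
lemma.

It is clear that if $\pi : \mathbf{X} \to \mathbf{Y}$ then $\zeta^{\mathbf{X}}_{\mathsf{C}} \succsim \zeta^{\mathbf{Y}}_{\mathsf{C}} \circ \pi$, and so there is an
essentially unique factorizing map, which we denote by $\mathsf{C}\pi$, that makes the
following diagram commute:

$$
\begin{array}{ccc}
\mathbf{X} & \stackrel{\pi}{\longrightarrow} & \mathbf{Y} \\
\zeta^{\mathbf{X}}_{\mathsf{C}} \downarrow & & \downarrow \zeta^{\mathbf{Y}}_{\mathsf{C}} \\
\mathsf{C}\mathbf{X} & \stackrel{\mathsf{C}\pi}{\longrightarrow} & \mathsf{C}\mathbf{Y}.
\end{array}
$$

In addition, we shall abbreviate $\mathbf{X} \times_{\zeta^{\mathbf{X}}_{\mathsf{C}}} \mathbf{X}$ to $\mathbf{X} \times_{\mathsf{C}} \mathbf{X}$, and similarly for the
individual spaces and measures defining this relatively independent product.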

The above lemma and definition explain the choice of the term 'idempo- 
tent', which is motivated by a more categorial viewpoint of such subclasses: 
if we identify such a class $\mathsf{C}$ with a full subcategory of the category of
$\Gamma$-systems with factor maps as morphisms, then the assignments
$\mathbf{X} \mapsto \mathsf{C}\mathbf{X}$, $\pi \mapsto \mathsf{C}\pi$ define an autofunctor of this category which is idempotent.

The name we give for our next definition is also motivated by this rela- 
tionship with functors. 

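Definition 2.2.4 (Order continuity). A class of $\Gamma$-systems $\mathsf{C}$ is order
continuous if whenever $(\mathbf{X}_m)_{m \geq 0}$, $(\psi^m_k)_{m \geq k \geq 0}$ is an inverse sequence of
$\Gamma$-systems with inverse limit $\mathbf{X}$, $(\psi_m)_{m \geq 0}$ we have

$$
\zeta^{\mathbf{X}}_{\mathsf{C}} = \bigvee_{m \geq 0} \zeta^{\mathbf{X}_m}_{\mathsf{C}} \circ \psi_m;
$$

that is, the maximal $\mathsf{C}$-factor of the inverse limit is simply given by the
(increasing) join of the maximal $\mathsf{C}$-factors of the contributing systems.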

Example Although all the idempotent classes that will matter to us later can 
be shown to be order continuous, it may be instructive to exhibit one that is 
not. In case Γ is an Abelian group, let us say that a system $\mathbf{X}$ has a
finite-dimensional Kronecker factor if its Kronecker factor
$\zeta_1^{\mathbf{X}} : \mathbf{X} \to \mathbf{Z}_1^{\mathbf{X}}$ can
be coordinatized as a direct integral (see Section 3 of [Ausc]) of rotations on
some measurably-varying compact Abelian groups all of which can be
isomorphically embedded into a fibre repository $\mathbb{T}^D$ for some fixed
$D \in \mathbb{N}$ (this includes the possibility that the Kronecker factor is finite
or trivial). It is now easy to check that the class of $\mathbb{Z}$-systems comprising
all those that are either themselves finite-dimensional Kronecker systems, or have a
Kronecker factor that is not finite-dimensional (so we exclude just those systems
that have a finite-dimensional Kronecker factor but properly contain it), is
idempotent but not order continuous, since any infinite-dimensional separable group
rotation can be identified with an inverse limit of finite-dimensional group
rotations. ◁

Definition 2.2.5 (Hereditariness). An idempotent class C is hereditary if it is 

also closed under taking factors. 

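Definition 2.2.6 (Join). If $\mathsf{C}_1$, $\mathsf{C}_2$ are idempotent classes, then the class
$\mathsf{C}_1 \vee \mathsf{C}_2$ of all joinings of members of $\mathsf{C}_1$ and $\mathsf{C}_2$ is
clearly also idempotent. We call $\mathsf{C}_1 \vee \mathsf{C}_2$ the join of $\mathsf{C}_1$ and
$\mathsf{C}_2$.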

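Lemma 2.2.7 (Join preserves order continuity). If $\mathsf{C}_1$ and $\mathsf{C}_2$ are both order
continuous then so is $\mathsf{C}_1 \vee \mathsf{C}_2$.

Proof Let $(\mathbf{X}_m)_{m \ge 0}$, $(\psi^m_k)_{m \ge k \ge 0}$ be an inverse sequence with inverse
limit $\mathbf{X}$, $(\psi_m)_{m \ge 0}$. Then $\zeta_{\mathsf{C}_1 \vee \mathsf{C}_2}^{\mathbf{X}}$ is the
maximal factor of $\mathbf{X}$ that is a joining of a $\mathsf{C}_1$-factor and a
$\mathsf{C}_2$-factor (so, in particular, it must be generated by its own
$\mathsf{C}_1$- and $\mathsf{C}_2$-factors), and hence it is equivalent to
$\zeta_{\mathsf{C}_1}^{\mathbf{X}} \vee \zeta_{\mathsf{C}_2}^{\mathbf{X}}$. Therefore any
$f \in L^\infty(\mu)$ that is $\zeta_{\mathsf{C}_1 \vee \mathsf{C}_2}^{\mathbf{X}}$-measurable can be
approximated in $L^2(\mu)$ by some function of the finite-sum form
$\sum_p g_{p,1} \cdot g_{p,2}$ with each $g_{p,i} \in L^\infty(\mu)$
being $\mathsf{C}_i$-measurable, and now since each $\mathsf{C}_i$ is order continuous we may
further approximate each $g_{p,i}$ by some $h_{p,i} \circ \psi_m$ for a large integer $m$ and
some $\mathsf{C}_i$-measurable $h_{p,i} \in L^\infty(\mu_m)$. Combining these approximations
completes the proof. □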

Examples Of course, we can form the joins of any of our earlier examples
of idempotent classes: for example, given a group Γ and subgroups
$\Lambda_1, \Lambda_2, \ldots, \Lambda_n \le \Gamma$ we can form
$\mathsf{Z}_0^{\Lambda_1} \vee \mathsf{Z}_0^{\Lambda_2} \vee \cdots \vee \mathsf{Z}_0^{\Lambda_n}$.
This particular example and several others like it will appear frequently throughout
the rest of this work. Clearly each class $\mathsf{Z}_0^{\Lambda}$ is hereditary, but in
general joins of several such classes are not; we will see this explicitly in the
first example of the next section. ◁

The following terminology will also prove useful. 

Definition 2.2.8 (Joining to an idempotent class; adjoining). If $\mathbf{X}$ is a system
and C is an idempotent class then a joining of $\mathbf{X}$ to C or a C-adjoining of
$\mathbf{X}$ is a joining of $\mathbf{X}$ and $\mathbf{Y}$ for some $\mathbf{Y} \in \mathsf{C}$.

2.3 Sated systems



The remainder of this dissertation concerns the consequences of one basic 
idea: that by extending a probability-preserving system, it is sometimes pos- 
sible to impose on it some additional structure that makes its behaviour more 
transparent. For our later applications, a notion of 'additional structure' that is 
both useful and obtainable is best summarized by demanding that the system 
does not admit a nontrivial joining to systems drawn from various other spe- 
cial classes. We will soon show that all systems admit extensions for which 
some version of this is true. This idea, although very abstract and very simple, 
will repeatedly prove surprisingly powerful. 

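Definition 2.3.1 (Sated system). Given an idempotent class C, a system $\mathbf{X}$ is
C-sated if whenever
$\pi : \tilde{\mathbf{X}} = (\tilde{X}, \tilde{\Sigma}, \tilde{\mu}, \tilde{T}) \to \mathbf{X}$
is an extension, the factor maps $\pi$ and $\zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}$ on
$\tilde{\mathbf{X}}$ are relatively independent over
$\zeta_{\mathsf{C}}^{\mathbf{X}} \circ \pi = \mathsf{C}\pi \circ \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}$
under $\tilde{\mu}$. Phrased more pictorially, the two systems in the middle row of the
commutative diagram

$$\begin{array}{ccccc}
 & & \tilde{\mathbf{X}} & & \\
 & \swarrow{\scriptstyle \pi} & & \searrow{\scriptstyle \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}} & \\
\mathbf{X} & & & & \mathsf{C}\tilde{\mathbf{X}} \\
 & \searrow{\scriptstyle \zeta_{\mathsf{C}}^{\mathbf{X}}} & & \swarrow{\scriptstyle \mathsf{C}\pi} & \\
 & & \mathsf{C}\mathbf{X} & &
\end{array}$$

are relatively independent over their common factor copy of the system
$\mathsf{C}\mathbf{X}$.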

An inverse sequence is C-sated if it has a cofinal subsequence all of whose 
systems are C-sated. 

Remark This definition has an important precedent in Furstenberg and Weiss'
notion of a 'pair homomorphism' between extensions elaborated in Section 8
of [FW96]. ◁

Example If $\mathbf{X} = (U, \mathrm{Borel}, \mathrm{Haar}, R_\phi)$ with $U$ a compact
metrizable Abelian group, $\phi : \mathbb{Z}^2 \to U$ a dense homomorphism and $R_\phi$ the
corresponding action of $\mathbb{Z}^2$ by rotations (so $R_\phi^{n}(z) := z + \phi(n)$), then
$\mathsf{Z}_0^{e_i}\mathbf{X}$ is coordinatized by the quotient homomorphism

$$U \to U / \overline{\phi(\mathbb{Z}e_i)}$$

and so $\mathbf{X}$ is a member of $\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}$ if and only if these
quotients together generate the whole of $U$, hence if and only if
$\overline{\phi(\mathbb{Z}e_1)} \cap \overline{\phi(\mathbb{Z}e_2)} = \{0\}$.

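On the other hand, any ergodic action $\mathbf{X}$ of $\mathbb{Z}^2$ by compact group rotations
can be extended to a member of $\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}$. To see this we first
note that ergodicity is equivalent to the denseness of $\phi(\mathbb{Z}^2)$ in $U$, and so in
particular that $\overline{\phi(\mathbb{Z}e_1)} + \overline{\phi(\mathbb{Z}e_2)} = U$. It follows
that the 'larger' group rotation system

$$\tilde{\mathbf{X}} = (\tilde{U}, \mathrm{Borel}, \mathrm{Haar}, R_{\tilde{\phi}}),$$

where $\tilde{U} := \overline{\phi(\mathbb{Z}e_1)} \oplus \overline{\phi(\mathbb{Z}e_2)}$ and the
homomorphism $\tilde{\phi} : \mathbb{Z}^2 \to \tilde{U}$ is defined by

$$\tilde{\phi}(e_1) := (\phi(e_1), 0) \quad\text{and}\quad \tilde{\phi}(e_2) := (0, \phi(e_2)),$$

is an extension of $\mathbf{X}$ through the factor map

$$\tilde{U} \to U : (x, y) \mapsto x + y.$$

Now $\tilde{\mathbf{X}}$ clearly satisfies the above condition for membership of
$\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}$, since the quotients by
$\overline{\tilde{\phi}(\mathbb{Z}e_i)}$ for $i = 1, 2$ are respectively the second and first
coordinate projections. It follows that every such $\mathbf{X}$ admits a
$(\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2})$-adjoining that generates the whole of $\mathbf{X}$, and
which is therefore not relatively independent over any proper factor of $\mathbf{X}$, and
hence that $\mathbf{X}$ itself is $(\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2})$-sated if and only if
it is already in the class $\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}$. This reasoning also
shows that the class $\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}$ is not hereditary.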
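As an illustration of the example above, the following worked instance specializes it to the circle. The choices $U = \mathbb{T}$ and the irrational frequencies $\alpha, \beta$ are assumptions made here for concreteness and do not appear in the text; the display is a sketch of the computation rather than part of the original argument.

```latex
% Assumed data: U = \mathbb{T} = \mathbb{R}/\mathbb{Z}, \phi(e_1) = \alpha, \phi(e_2) = \beta,
% with \alpha and \beta both irrational.
\begin{align*}
\overline{\phi(\mathbb{Z}e_1)} &= \overline{\phi(\mathbb{Z}e_2)} = \mathbb{T}
  && \text{(each irrational rotation has dense orbits)}\\
\overline{\phi(\mathbb{Z}e_1)} \cap \overline{\phi(\mathbb{Z}e_2)} &= \mathbb{T} \neq \{0\}
  && \text{(so } \mathbf{X} \notin \mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}\text{)}\\
\tilde{U} &= \mathbb{T} \oplus \mathbb{T}, \quad
  \tilde{\phi}(e_1) = (\alpha, 0), \quad \tilde{\phi}(e_2) = (0, \beta)\\
(x + \alpha,\, y) &\mapsto (x + y) + \alpha, \qquad
(x,\, y + \beta) \mapsto (x + y) + \beta
  && \text{(the map } (x,y) \mapsto x+y \text{ is equivariant)}\\
\tilde{U} / \overline{\tilde{\phi}(\mathbb{Z}e_1)} &= \tilde{U}/(\mathbb{T} \oplus \{0\}) \cong \mathbb{T}
  \ \text{via}\ (x,y) \mapsto y, \qquad
\tilde{U} / \overline{\tilde{\phi}(\mathbb{Z}e_2)} \cong \mathbb{T}
  \ \text{via}\ (x,y) \mapsto x.
\end{align*}
```

In this instance both maximal invariant factors of $\mathbf{X}$ are trivial, whereas the two coordinate projections of the extension together generate all of $\tilde{U}$.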

A little more generally, if $\mathbf{X}$ is a totally weakly mixing extension of an
ergodic action $\mathbf{Y}$ of $\mathbb{Z}^2$ by compact group rotations, then routine arguments
show that $\mathbf{X}$ is $(\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2})$-sated if and only if this is
true of $\mathbf{Y}$ (since a totally weakly mixing extension is relatively disjoint from any
$\mathsf{Z}_0^{e_1}$-system, and given this the Furstenberg-Zimmer Inverse Theorem implies
that the $e_2$-invariant factor of any $\mathsf{Z}_0^{e_2}$-adjoining of $\mathbf{X}$ is also
relatively independent from $\mathbf{X}$ over its factor map to $\mathbf{Y}$; see, for instance,
Chapters 9 and 10 of [Gla03]). Therefore such an $\mathbf{X}$ is
$(\mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2})$-sated if and only if
$\mathbf{Y} \in \mathsf{Z}_0^{e_1} \vee \mathsf{Z}_0^{e_2}$. ◁

The crucial technical fact that turns satedness into a useful tool is the abil- 
ity to construct sated extensions of arbitrary systems. This can be seen as a 
natural abstraction from Propositions 4.6 of [Aus09] and 4.3 of [Ausb], and
appears in its full strength as Theorem 3.11 in [Ausd].

Theorem 2.3.2 (Idempotent classes admit multiply sated extensions). If
$(\mathsf{C}_i)_{i \in I}$ is a countable family of idempotent classes then any system
$\mathbf{X}_0$ admits an extension $\pi : \mathbf{X} \to \mathbf{X}_0$ such that

• $\mathbf{X}$ is $\mathsf{C}_i$-sated for every $i \in I$;

• the factors $\pi$ and $\bigvee_{i \in I} \zeta_{\mathsf{C}_i}^{\mathbf{X}}$ generate the whole of
$\mathbf{X}$.

We shall prove this result after a preliminary lemma. 

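Lemma 2.3.3. If C is an idempotent class then the inverse limit of any C-sated
inverse sequence is C-sated.

Proof By passing to a subsequence if necessary, it suffices to suppose that
$(\mathbf{X}_m)_{m \ge 0}$, $(\psi^m_k)_{m \ge k \ge 0}$ is an inverse sequence of C-sated systems
with inverse limit $\mathbf{X}_\infty$, $(\psi_m)_{m \ge 1}$, and let
$\pi : \tilde{\mathbf{X}} \to \mathbf{X}_\infty$ be any further extension and
$f \in L^\infty(\mu_\infty)$. We will commit the abuse of identifying such a function with
its lift to any given extension when the extension in question is obvious. With
this in mind, we need to show that

$$\mathsf{E}(f \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}) = \mathsf{E}(f \mid \zeta_{\mathsf{C}}^{\mathbf{X}_\infty}).$$

However, by the C-satedness of each $\mathbf{X}_m$, we certainly have

$$\mathsf{E}\big(\mathsf{E}(f \mid \psi_m) \,\big|\, \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}\big) = \mathsf{E}(f \mid \zeta_{\mathsf{C}}^{\mathbf{X}_m}),$$

and now as $m \to \infty$ this equation converges in $L^2(\tilde{\mu})$ to

$$\mathsf{E}(f \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}) = \mathsf{E}\Big(f \,\Big|\, \bigvee_{m \ge 1} (\zeta_{\mathsf{C}}^{\mathbf{X}_m} \circ \psi_m)\Big).$$

By monotonicity we have

$$\zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}} \succsim \zeta_{\mathsf{C}}^{\mathbf{X}_\infty} \succsim \bigvee_{m \ge 1} \zeta_{\mathsf{C}}^{\mathbf{X}_m} \circ \psi_m,$$
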
and so by sandwiching the desired equality of conditional expectations must
also hold. □

Proof of Theorem 2.3.2 We first prove this for $I$ a singleton, and then in
the general case.

Step 1 Suppose that $I = \{i\}$ and $\mathsf{C}_i = \mathsf{C}$. This case will follow from a
simple 'energy increment' argument.

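Let $(f_r)_{r \ge 1}$ be a countable subset of the $L^\infty$-unit ball
$\{f \in L^\infty(\mu_0) : \|f\|_\infty \le 1\}$ that is dense in this ball for the
$L^2$-norm, and let $(r_i)_{i \ge 1}$ be a member of $\mathbb{N}^{\mathbb{N}}$ in which every
non-negative integer appears infinitely often.

We will construct an inverse sequence $(\mathbf{X}_m)_{m \ge 0}$, $(\psi^m_k)_{m \ge k \ge 0}$ starting
from $\mathbf{X}_0$ such that each $\mathbf{X}_{m+1}$ is a C-adjoining of $\mathbf{X}_m$. Suppose that
for some $m_1 \ge 0$ we have already obtained $(\mathbf{X}_m)_{m=0}^{m_1}$,
$(\psi^m_k)_{m_1 \ge m \ge k \ge 0}$ such that
$\mathrm{id}_{X_{m_1}} \simeq \zeta_{\mathsf{C}}^{\mathbf{X}_{m_1}} \vee \psi^{m_1}_0$. We consider two
separate cases:

• If there is some further extension $\pi : \tilde{\mathbf{X}} \to \mathbf{X}_{m_1}$ such that

$$\big\|\mathsf{E}_{\tilde{\mu}}(f_{r_{m_1}} \circ \psi^{m_1}_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 > \big\|\mathsf{E}_{\mu_{m_1}}(f_{r_{m_1}} \circ \psi^{m_1}_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_{m_1}})\big\|_2^2 + 2^{-m_1},$$

then choose a particular $\pi : \tilde{\mathbf{X}} \to \mathbf{X}_{m_1}$ such that the increase

$$\big\|\mathsf{E}_{\tilde{\mu}}(f_{r_{m_1}} \circ \psi^{m_1}_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 - \big\|\mathsf{E}_{\mu_{m_1}}(f_{r_{m_1}} \circ \psi^{m_1}_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_{m_1}})\big\|_2^2$$

is at least half its supremal possible value over all extensions. By
restricting to the possibly smaller subextension of $\tilde{\mathbf{X}} \to \mathbf{X}_{m_1}$
generated by $\pi$ and $\zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}$ we may assume that $\tilde{\mathbf{X}}$ is
itself a C-adjoining of $\mathbf{X}_{m_1}$ and hence of $\mathbf{X}_0$, and now we let
$\mathbf{X}_{m_1 + 1} := \tilde{\mathbf{X}}$ and $\psi^{m_1 + 1}_{m_1} := \pi$ (the other
connecting factor maps being determined by this one).

• If, on the other hand, for every further extension
$\pi : \tilde{\mathbf{X}} \to \mathbf{X}_{m_1}$ we have

$$\big\|\mathsf{E}_{\tilde{\mu}}(f_{r_{m_1}} \circ \psi^{m_1}_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 \le \big\|\mathsf{E}_{\mu_{m_1}}(f_{r_{m_1}} \circ \psi^{m_1}_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_{m_1}})\big\|_2^2 + 2^{-m_1},$$

then we simply set $\mathbf{X}_{m_1 + 1} := \mathbf{X}_{m_1}$ and
$\psi^{m_1 + 1}_{m_1} := \mathrm{id}_{X_{m_1}}$.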
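Finally, let $\mathbf{X}_\infty$, $(\psi_m)_{m \ge 0}$ be the inverse limit of this sequence. We have

$$\mathrm{id}_{X_\infty} \simeq \bigvee_{m \ge 0} \psi_m \simeq \bigvee_{m \ge 0} (\zeta_{\mathsf{C}}^{\mathbf{X}_m} \vee \psi^m_0) \circ \psi_m
\simeq \bigvee_{m \ge 0} (\zeta_{\mathsf{C}}^{\mathbf{X}_m} \circ \psi_m) \vee \bigvee_{m \ge 0} (\psi^m_0 \circ \psi_m) \precsim \zeta_{\mathsf{C}}^{\mathbf{X}_\infty} \vee \psi_0,$$

so $\mathbf{X}_\infty$ is still a C-adjoining of $\mathbf{X}_0$. To show that it is C-sated, let
$\pi : \tilde{\mathbf{X}} \to \mathbf{X}_\infty$ be any further extension, and suppose that
$f \in L^\infty(\mu_\infty)$. We will complete the proof for Step 1 by showing that

$$\mathsf{E}_{\tilde{\mu}}(f \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}) = \mathsf{E}_{\mu_\infty}(f \mid \zeta_{\mathsf{C}}^{\mathbf{X}_\infty}) \circ \pi.$$

Since $\mathbf{X}_\infty$ is a C-adjoining of $\mathbf{X}_0$, this $f$ may be approximated arbitrarily
well in $L^2(\mu_\infty)$ by finite sums of the form $\sum_p g_p \cdot h_p$ with $g_p$ being bounded
and $\zeta_{\mathsf{C}}^{\mathbf{X}_\infty}$-measurable and $h_p$ being bounded and $\psi_0$-measurable,
and now by density we may also restrict to using $h_p$ that are each a scalar multiple of
some $f_{r_p} \circ \psi_0$, so by continuity and multilinearity it suffices to prove the above
equality for just one such product $g \cdot (f_r \circ \psi_0)$. Since $g$ is
$\zeta_{\mathsf{C}}^{\mathbf{X}_\infty}$-measurable, this requirement now reduces to

$$\mathsf{E}_{\tilde{\mu}}(f_r \circ \psi_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}}) = \mathsf{E}_{\mu_\infty}(f_r \circ \psi_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_\infty}) \circ \pi.$$

Since $\zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}} \succsim \zeta_{\mathsf{C}}^{\mathbf{X}_\infty} \circ \pi$, this will
follow if we only show that

$$\big\|\mathsf{E}_{\tilde{\mu}}(f_r \circ \psi_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 = \big\|\mathsf{E}_{\mu_\infty}(f_r \circ \psi_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_\infty})\big\|_2^2.$$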
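Now, by the martingale convergence theorem we have

$$\big\|\mathsf{E}_{\mu_m}(f_r \circ \psi^m_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_m})\big\|_2^2 \uparrow \big\|\mathsf{E}_{\mu_\infty}(f_r \circ \psi_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_\infty})\big\|_2^2$$

as $m \to \infty$. It follows that if

$$\big\|\mathsf{E}_{\tilde{\mu}}(f_r \circ \psi_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 > \big\|\mathsf{E}_{\mu_\infty}(f_r \circ \psi_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_\infty})\big\|_2^2$$

then for some sufficiently large $m$ we would have $r_m = r$ (since each integer
appears infinitely often as some $r_m$) but also

$$\big\|\mathsf{E}_{\mu_{m+1}}(f_r \circ \psi^{m+1}_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_{m+1}})\big\|_2^2 - \big\|\mathsf{E}_{\mu_m}(f_r \circ \psi^m_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_m})\big\|_2^2
< \tfrac{1}{2}\Big(\big\|\mathsf{E}_{\tilde{\mu}}(f_r \circ \psi_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 - \big\|\mathsf{E}_{\mu_m}(f_r \circ \psi^m_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_m})\big\|_2^2\Big)$$

and

$$\big\|\mathsf{E}_{\tilde{\mu}}(f_r \circ \psi_0 \circ \pi \mid \zeta_{\mathsf{C}}^{\tilde{\mathbf{X}}})\big\|_2^2 > \big\|\mathsf{E}_{\mu_m}(f_r \circ \psi^m_0 \mid \zeta_{\mathsf{C}}^{\mathbf{X}_m})\big\|_2^2 + 2^{-m},$$

so contradicting our choice of $\mathbf{X}_{m+1} \to \mathbf{X}_m$ in the first alternative in our
construction above. This contradiction shows that we must actually have the
equality of $L^2$-norms required.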

Step 2 The general case follows easily from Step 1 and a second inverse
limit construction: choose a sequence $(i_m)_{m \ge 1} \in I^{\mathbb{N}}$ in which each
member of $I$ appears infinitely often, and form an inverse sequence
$(\mathbf{X}_m)_{m \ge 0}$, $(\psi^m_k)_{m \ge k \ge 0}$ starting from $\mathbf{X}_0$ such that each
$\mathbf{X}_m$ is $\mathsf{C}_{i_m}$-sated for $m \ge 1$. The inverse limit $\mathbf{X}$ is now sated for
every $\mathsf{C}_i$, by Lemma 2.3.3. □

Remark Thierry de la Rue has shown me another proof of Theorem 2.3.2
in case Γ is a group that follows very quickly from ideas contained in his
paper [LRR03] with Lesigne and Rittaud, and which has now received a nice
separate writeup in [Rue]. The key observation is that

An idempotent class C is hereditary if and only if every system is C-sated.

This in turn follows from a striking result of Lemańczyk, Parreau and Thouvenot
[LPT00] that if two systems $\mathbf{X}$ and $\mathbf{Y}$ are not disjoint then $\mathbf{X}$ shares a
nontrivial factor with the infinite Cartesian power $\mathbf{Y}^{\times\infty}$. Given now an
idempotent class C and a system $\mathbf{X}$, let $\mathsf{C}^*$ be the hereditary idempotent class
of all factors of members of C, and let $\mathbf{Y}$ be any C-system admitting a factor map
$\pi : \mathbf{Y} \to \mathsf{C}^*\mathbf{X}$ (such exists because by definition
$\mathsf{C}^*\mathbf{X}$ is a factor of some C-system). Now forming
$\tilde{\mathbf{X}} := \mathbf{X} \times_{\{\zeta_{\mathsf{C}^*}^{\mathbf{X}} = \pi\}} \mathbf{Y}$ (so here is
where we need Γ to be a group), a quick check using the above fact shows that
$\mathsf{C}\tilde{\mathbf{X}} = \mathsf{C}^*\tilde{\mathbf{X}}$, and that this is equivalent to the
C-satedness of $\tilde{\mathbf{X}}$. ◁

Chapter 3



The convergence of 
nonconventional averages 

In this chapter Theorem C will be deduced from Theorem 2.3.2. This amounts
to a rather simpler outing for many of the same ideas that will go into proving 
recurrence in the next chapter. 

We first recall the Hilbert space version of a classical estimate due to van 
der Corput, which has long been a workhorse of Ergodic Ramsey Theory. 
After giving this its own section, the Furstenberg self-joining for a tuple of 
transformations is introduced, and then in the last section we show how the 
right instance of satedness implies that these enjoy some additional structure 
from which a proof of Theorem C follows quite quickly. 

Notation 

Before commencing with any of these proofs, we make a slight modification
to the notation of the Introduction to be more in keeping with that of
Chapter 2: rather than letting $T_1, T_2, \ldots, T_d$ denote a tuple of commuting individual
transformations on $(X, \Sigma, \mu)$, we henceforth regard these as the subactions of
the basis vectors $e_1, e_2, \ldots, e_d$ for a single $\mathbb{Z}^d$-action $T$. Theorem C is
accordingly re-phrased as asserting that the averages

$$\frac{1}{N} \sum_{n=1}^{N} (f_1 \circ T^{n e_1}) \cdot (f_2 \circ T^{n e_2}) \cdots (f_d \circ T^{n e_d})$$

converge in $L^2(\mu)$ for any $\mathbb{Z}^d$-system $(X, \Sigma, \mu, T)$. This slight increase in
abstraction will prove worth tolerating when we come to various constructions
of new actions from old during our later arguments, in which we will need to
keep efficient track of how the action of one vector in $\mathbb{Z}^d$ may have been
re-assigned to that of another. It follows that in the remainder of this work, a
list such as '$T_1, T_2, \ldots, T_d$' will denote a tuple of whole actions of some
previously-decided group, rather than individual transformations.



3.1 The van der Corput estimate 

This result and a related discussion can be found, for example, as Theorem
2.2 of Bergelson [Ber96].

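Proposition 3.1.1 (Van der Corput estimate). Suppose that $(u_n)_{n \ge 1}$ is a bounded
sequence in a Hilbert space $\mathfrak{H}$. If the vector-valued averages

$$\frac{1}{N} \sum_{n=1}^{N} u_n$$

do not converge to $0$ in norm as $N \to \infty$, then also the scalar-valued averages

$$\frac{1}{M} \sum_{m=1}^{M} \frac{1}{N} \sum_{n=1}^{N} \langle u_n, u_{n+m} \rangle$$

do not converge to $0$ as $N \to \infty$ and then $M \to \infty$.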
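Proof For any fixed $H \ge 1$ we have

$$\frac{1}{N} \sum_{n=1}^{N} u_n \sim \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H} \sum_{h=1}^{H} u_{n+h}$$

as $N \to \infty$, where the notation $v_N \sim w_N$ denotes that $v_N - w_N \to 0$
in $\mathfrak{H}$. However, the squared norm of the right-hand double average may be
estimated by

$$\Big\| \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \Big\|^2 \le \frac{1}{N} \sum_{n=1}^{N} \Big\| \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \Big\|^2$$

(using the triangle and Cauchy-Schwarz inequalities), and this right-hand
side is equal to

$$\frac{1}{H^2} \sum_{h_1, h_2 = 1}^{H} \frac{1}{N} \sum_{n=1}^{N} \langle u_{n+h_1}, u_{n+h_2} \rangle.$$

It follows that these averages must also not converge to $0$ as $N \to \infty$ and
then $H \to \infty$; but for large $H$ these can be expressed as averages of the
averages

$$\frac{1}{M} \sum_{m=1}^{M} \frac{1}{N} \sum_{n=1}^{N} \langle u_n, u_{n+m} \rangle$$

for correspondingly large values of $M$, and so these also cannot converge to
$0$ as $N \to \infty$ and then $M \to \infty$, as required. □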
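The first step of the proof above, the replacement of $u_n$ by its shifted averages, can be made quantitative. The following display is a sketch of that estimate; the constant $B := \sup_n \|u_n\|$ is a name introduced here for illustration and is not notation from the text.

```latex
% Assumption: B := \sup_{n \ge 1} \|u_n\| < \infty, and N \ge H \ge h \ge 1.
\begin{align*}
\Big\| \frac{1}{N} \sum_{n=1}^{N} u_n - \frac{1}{N} \sum_{n=1}^{N} u_{n+h} \Big\|
  &= \frac{1}{N} \Big\| \sum_{n=1}^{h} u_n - \sum_{n=N+1}^{N+h} u_n \Big\|
   \le \frac{2hB}{N},\\
\Big\| \frac{1}{N} \sum_{n=1}^{N} u_n - \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \Big\|
  &\le \frac{1}{H} \sum_{h=1}^{H} \frac{2hB}{N}
   = \frac{(H+1)B}{N} \longrightarrow 0 \quad (N \to \infty).
\end{align*}
```

The first line holds because the two sums share all terms with indices in $\{h+1, \ldots, N\}$, so only $2h$ boundary terms survive; this is why $H$ must be held fixed while $N \to \infty$.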



3.2 The Furstenberg self-joining

Theorem C is proved by induction on d. In the first instance, this induction 
is enabled by a construction that is made possible once convergence is known 
for a smaller number of transformations, and which will also be central to the 
proof of Theorem A in the next chapter. 

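Thus, suppose now that for some $d \ge 1$ the convergence of Theorem C
is known for all tuples of at most $d - 1$ commuting transformations (so this
assumption is vacuous if $d = 1$). Let $\mathbf{X} = (X, \Sigma, \mu, T)$ be a $\mathbb{Z}^d$-system,
and let $A_1, A_2, \ldots, A_d \in \Sigma$. By integrating and using the invariance of $\mu$
under $T^{e_1}$, our assumption applied to the transformations
$T^{e_2 - e_1}, \ldots, T^{e_d - e_1}$ implies that the scalar averages

$$\frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-n e_1}(A_1) \cap T^{-n e_2}(A_2) \cap \cdots \cap T^{-n e_d}(A_d)\big)
= \frac{1}{N} \sum_{n=1}^{N} \mu\big(A_1 \cap T^{-n(e_2 - e_1)}(A_2) \cap \cdots \cap T^{-n(e_d - e_1)}(A_d)\big)$$

converge as $N \to \infty$. Moreover, the limit takes the form
$\mu^{\mathrm{F}}(A_1 \times A_2 \times \cdots \times A_d)$ for some probability $\mu^{\mathrm{F}}$ on $X^d$
that is invariant under the diagonal $\mathbb{Z}^d$-action defined by
$(T^{\times d})^n := T^n \times T^n \times \cdots \times T^n$, simply because it is a
limit of averages of the off-diagonal joinings

$$\int_X \delta_{(T^{n e_1}(x),\, T^{n e_2}(x),\, \ldots,\, T^{n e_d}(x))} \, \mu(\mathrm{d}x) \quad \text{for } n \in \mathbb{N}.$$

The $\mathbb{Z}^d$-system $\mathbf{X}^{\mathrm{F}} := (X^d, \Sigma^{\otimes d}, \mu^{\mathrm{F}}, T^{\times d})$ is
therefore a $d$-fold self-joining of $\mathbf{X}$ through the $d$ coordinate projections
$\pi_i : \mathbf{X}^{\mathrm{F}} \to \mathbf{X}$. We refer to either $\mu^{\mathrm{F}}$ or $\mathbf{X}^{\mathrm{F}}$ as the
Furstenberg self-joining of $\mathbf{X}$. Given functions $f_1, f_2, \ldots, f_d \in L^\infty(\mu)$,
by approximating each of them in $L^\infty$ using step functions we
may extend the above definition of $\mu^{\mathrm{F}}$ to the convergence

$$\frac{1}{N} \sum_{n=1}^{N} \int_X (f_1 \circ T^{n e_1}) \cdot (f_2 \circ T^{n e_2}) \cdots (f_d \circ T^{n e_d}) \, \mathrm{d}\mu
\longrightarrow \int_{X^d} f_1 \otimes f_2 \otimes \cdots \otimes f_d \, \mathrm{d}\mu^{\mathrm{F}}$$

as $N \to \infty$.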

In addition to its invariance under $T^{\times d}$, the definition of $\mu^{\mathrm{F}}$ gives an
additional invariance that will shortly prove crucial.

Lemma 3.2.1. Provided the limiting self-joining $\mu^{\mathrm{F}}$ exists, it is also invariant
under the transformation $T^{e_1} \times T^{e_2} \times \cdots \times T^{e_d}$.

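Proof For any $A_1, A_2, \ldots, A_d \in \Sigma$ we have

$$\begin{aligned}
\mu^{\mathrm{F}}\big((T^{e_1} \times T^{e_2} \times \cdots \times T^{e_d})^{-1}(A_1 \times A_2 \times \cdots \times A_d)\big)
&= \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-(n+1) e_1}(A_1) \cap \cdots \cap T^{-(n+1) e_d}(A_d)\big) \\
&= \lim_{N \to \infty} \frac{1}{N} \sum_{n=2}^{N+1} \mu\big(T^{-n e_1}(A_1) \cap \cdots \cap T^{-n e_d}(A_d)\big) \\
&= \mu^{\mathrm{F}}(A_1 \times A_2 \times \cdots \times A_d),
\end{aligned}$$

where the last equality follows because the discrete intervals $\{1, 2, \ldots, N\}$
and $\{2, 3, \ldots, N+1\}$ asymptotically overlap in $1 - o(1)$ of their lengths. □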
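The overlap claim at the end of this proof amounts to a one-line estimate. In the sketch below, $a_n$ is shorthand introduced here (not notation from the text) for the $n$-th term of the averages.

```latex
% Shorthand (assumed here): a_n := \mu\big(T^{-n e_1}(A_1) \cap \cdots \cap T^{-n e_d}(A_d)\big) \in [0,1].
\[
\Big| \frac{1}{N} \sum_{n=2}^{N+1} a_n - \frac{1}{N} \sum_{n=1}^{N} a_n \Big|
  = \frac{|a_{N+1} - a_1|}{N} \le \frac{1}{N} \longrightarrow 0
  \quad (N \to \infty).
\]
```

So the two sequences of averages have the same limit whenever either one converges.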

It will be important to know that Furstenberg self-joinings behave well
under inverse limits. The following is another immediate consequence of the
definition, and we omit the proof.

Lemma 3.2.2. If $(\mathbf{X}_m)_{m \ge 0}$, $(\psi^m_k)_{m \ge k \ge 0}$ is an inverse sequence with inverse
limit $\mathbf{X}$, $(\psi_m)_{m \ge 0}$, then the Furstenberg self-joinings $\mathbf{X}_m^{\mathrm{F}}$ form an
inverse sequence under the factor maps $(\psi^m_k)^{\times d}$ with inverse limit
$\mathbf{X}^{\mathrm{F}}$, $(\psi_m^{\times d})_{m \ge 0}$. □



3.3 The proof of convergence 

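The final observation needed before we prove Theorem C is that satedness
implies a certain inverse result for the situation in which the functional averages

$$S_N(f_1, f_2, \ldots, f_d) := \frac{1}{N} \sum_{n=1}^{N} (f_1 \circ T^{n e_1}) \cdot (f_2 \circ T^{n e_2}) \cdots (f_d \circ T^{n e_d})$$

do not converge to 0.

Proposition 3.3.1. Suppose that $\mathbf{X}$ is C-sated for the idempotent class

$$\mathsf{C} := \mathsf{Z}_0^{e_1} \vee \bigvee_{i=2}^{d} \mathsf{Z}_0^{e_1 - e_i}$$

and that $f_i \in L^\infty(\mu)$ for $i = 1, 2, \ldots, d$. In addition, let
$\Phi := \Sigma^{T^{e_1}} \vee \bigvee_{i=2}^{d} \Sigma^{T^{e_1} = T^{e_i}}$, so this is a factor of
$\mathbf{X}$. If

$$S_N(f_1, f_2, \ldots, f_d) \not\to 0$$

as $N \to \infty$, then also $\mathsf{E}(f_1 \mid \Phi) \ne 0$.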

Remark In the terminology of [FW96], which has since become standard
in this area (and is roughly followed in [Aus09]), this asserts that for a C-sated
system $\mathbf{X}$ the factor $\Phi$ is partially characteristic. ◁

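Proof This rests on an appeal to the van der Corput estimate followed by a
re-interpretation of what it tells us. Letting
$u_n := (f_1 \circ T^{n e_1}) \cdot (f_2 \circ T^{n e_2}) \cdots (f_d \circ T^{n e_d})$,
Proposition 3.1.1 and our assumption imply that the double averages

$$\frac{1}{M} \sum_{m=1}^{M} \frac{1}{N} \sum_{n=1}^{N} \langle u_n, u_{n+m} \rangle
= \frac{1}{M} \sum_{m=1}^{M} \frac{1}{N} \sum_{n=1}^{N} \int_X \prod_{i=1}^{d} (f_i \circ T^{n e_i}) \cdot \overline{\prod_{i=1}^{d} (f_i \circ T^{(n+m) e_i})} \, \mathrm{d}\mu$$

do not tend to $0$ as $N \to \infty$ and then $M \to \infty$. However, simply by
re-arranging the individual functions and recalling the definition of $\mu^{\mathrm{F}}$, the limit
in $N$ behaves as

$$\begin{aligned}
&\frac{1}{M} \sum_{m=1}^{M} \frac{1}{N} \sum_{n=1}^{N} \int_X \prod_{i=1}^{d} \big( (f_i \cdot (\overline{f_i} \circ T^{m e_i})) \circ T^{n e_i} \big) \, \mathrm{d}\mu \\
&\qquad \longrightarrow \frac{1}{M} \sum_{m=1}^{M} \int_{X^d} (f_1 \cdot (\overline{f_1} \circ T^{m e_1})) \otimes \cdots \otimes (f_d \cdot (\overline{f_d} \circ T^{m e_d})) \, \mathrm{d}\mu^{\mathrm{F}} \\
&\qquad = \frac{1}{M} \sum_{m=1}^{M} \int_{X^d} (f_1 \otimes \cdots \otimes f_d) \cdot \big( \overline{(f_1 \otimes \cdots \otimes f_d)} \circ (T^{e_1} \times \cdots \times T^{e_d})^m \big) \, \mathrm{d}\mu^{\mathrm{F}}.
\end{aligned}$$

Now, since Lemma 3.2.1 gives that $\mu^{\mathrm{F}}$ is invariant under
$T^{e_1} \times T^{e_2} \times \cdots \times T^{e_d}$, the classical mean ergodic theorem allows us
to take the limit in $M$ to obtain

$$\int_{X^d} (f_1 \otimes \cdots \otimes f_d) \cdot \mathsf{E}_{\mu^{\mathrm{F}}}\big( \overline{f_1 \otimes \cdots \otimes f_d} \,\big|\, (\Sigma^{\otimes d})^{T^{e_1} \times \cdots \times T^{e_d}} \big) \, \mathrm{d}\mu^{\mathrm{F}}.$$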

Thus the van der Corput estimate tells us that this integral is non-zero. The
proof is completed simply by re-phrasing this conclusion slightly. We have
previously used $\mu^{\mathrm{F}}$ to define a $\mathbb{Z}^d$-system $\mathbf{X}^{\mathrm{F}}$, but in light of
Lemma 3.2.1 we may alternatively use it to define a $\mathbb{Z}^d$-system $\tilde{\mathbf{X}}$ by setting

$$\tilde{T}^{e_1} := T^{e_1} \times T^{e_2} \times \cdots \times T^{e_d}$$

and

$$\tilde{T}^{e_i} := (T^{\times d})^{e_i} \quad \text{for } i = 2, 3, \ldots, d$$

(thus, the basis direction $e_1$ is treated differently from the others). With this
definition the first coordinate projection $\pi_1 : X^d \to X$ still defines a factor
map of $\mathbb{Z}^d$-systems $\tilde{\mathbf{X}} \to \mathbf{X}$, because $\tilde{T}^n$ does agree with $T^n$
on the first coordinate in $X^d$ for every $n$. On the other hand, for
$i = 2, 3, \ldots, d$ the function $f_i \circ \pi_i \in L^\infty(\mu^{\mathrm{F}})$ depends only on the
$i$-th coordinate in $X^d$, and on this coordinate the transformations $\tilde{T}^{e_1}$ and
$\tilde{T}^{e_i}$ agree, so that $f_i \circ \pi_i$ is $\tilde{T}^{e_1 - e_i}$-invariant. Thus the
nonvanishing

$$\int_{X^d} (f_1 \otimes \cdots \otimes f_d) \cdot \mathsf{E}_{\mu^{\mathrm{F}}}\big( \overline{f_1 \otimes \cdots \otimes f_d} \,\big|\, \tilde{\Sigma}^{\tilde{T}^{e_1}} \big) \, \mathrm{d}\mu^{\mathrm{F}} \ne 0$$

asserts that the lifted function $f_1 \circ \pi_1$ has a nontrivial inner product with a
function that is a pointwise product of $\tilde{\Sigma}^{\tilde{T}^{e_1} = \tilde{T}^{e_i}}$-measurable
functions for $i = 2, 3, \ldots, d$ and the function
$\mathsf{E}_{\mu^{\mathrm{F}}}(f_1 \otimes \cdots \otimes f_d \mid \tilde{\Sigma}^{\tilde{T}^{e_1}})$, which is
manifestly $\tilde{\Sigma}^{\tilde{T}^{e_1}}$-measurable. Therefore $f_1 \circ \pi_1$ has a nontrivial
conditional expectation onto
$\tilde{\Sigma}^{\tilde{T}^{e_1}} \vee \bigvee_{i=2}^{d} \tilde{\Sigma}^{\tilde{T}^{e_1} = \tilde{T}^{e_i}}$, which
is the σ-algebra generated by the factor map $\tilde{\mathbf{X}} \to \mathsf{C}\tilde{\mathbf{X}}$. On the
other hand, by C-satedness $f_1 \circ \pi_1$ must be relatively independent from this
σ-algebra over $\Phi$, and so we also have $\mathsf{E}_\mu(f_1 \mid \Phi) \ne 0$, as required. □

Proof of Theorem C This proceeds by induction on $d$. The case $d = 1$ is
the classical mean ergodic theorem, so suppose now that $d \ge 2$, that we know
the result for all tuples of at most $d - 1$ transformations and that we are given
$T : \mathbb{Z}^d \curvearrowright (X, \Sigma, \mu)$.

Let C be the class in Proposition 3.3.1. By Theorem 2.3.2 we may choose
a C-sated extension $\pi : \tilde{\mathbf{X}} \to \mathbf{X}$, and now since the corresponding inclusion
$L^\infty(\mu) \hookrightarrow L^\infty(\tilde{\mu})$ is an embedding of algebras that preserves the
norms $\|\cdot\|_2$ it will suffice to prove convergence for the analogs of the averages
$S_N$ associated to $\tilde{\mathbf{X}}$. To lighten notation we henceforth assume that $\mathbf{X}$ itself
is C-sated.

Suppose that $f_1, f_2, \ldots, f_d \in L^\infty(\mu)$. Letting
$\Phi := \Sigma^{T^{e_1}} \vee \bigvee_{i=2}^{d} \Sigma^{T^{e_1} = T^{e_i}}$,
we see that the function $f_1 - \mathsf{E}(f_1 \mid \Phi)$ has zero conditional expectation onto
$\Phi$, and so by the multilinearity of $S_N$ and Proposition 3.3.1 we have that

$$S_N(f_1, f_2, \ldots, f_d) - S_N(\mathsf{E}(f_1 \mid \Phi), f_2, \ldots, f_d)
= S_N(f_1 - \mathsf{E}(f_1 \mid \Phi), f_2, \ldots, f_d) \to 0$$

in $L^2(\mu)$ as $N \to \infty$. It therefore suffices to prove convergence with $f_1$
replaced by $\mathsf{E}(f_1 \mid \Phi)$, or equivalently under the assumption that $f_1$ is
$\Phi$-measurable.

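However, this implies that $f_1$ may be approximated in $\|\cdot\|_2$ by finite sums
of the form $\sum_p g_p \cdot h_{2,p} \cdot h_{3,p} \cdots h_{d,p}$ in which each $g_p$ is
$T^{e_1}$-invariant and each $h_{j,p}$ is $T^{e_1 - e_j}$-invariant. Since the operator

$$f_1 \mapsto S_N(f_1, f_2, \ldots, f_d)$$

is linear and uniformly continuous in $L^2(\mu)$ for fixed bounded $f_2, f_3, \ldots, f_d$, it
therefore suffices to prove convergence in case $f_1$ is simply one such product,
say $g h_2 h_3 \cdots h_d$. For this function, however, we can re-arrange our averages
as

$$\begin{aligned}
S_N(g h_2 h_3 \cdots h_d, f_2, \ldots, f_d)
&= \frac{1}{N} \sum_{n=1}^{N} \big((g h_2 \cdots h_d) \circ T^{n e_1}\big) \cdot (f_2 \circ T^{n e_2}) \cdots (f_d \circ T^{n e_d}) \\
&= g \cdot \frac{1}{N} \sum_{n=1}^{N} \big((h_2 f_2) \circ T^{n e_2}\big) \cdots \big((h_d f_d) \circ T^{n e_d}\big),
\end{aligned}$$

since $g \circ T^{n e_1} = g$ and $h_j \circ T^{n e_1} = h_j \circ T^{n e_j}$ for each
$j = 2, 3, \ldots, d$. Now the averages appearing on the right are uniformly bounded in
$\|\cdot\|_\infty$ and involve only the $d - 1$ transformations $T^{e_2}, T^{e_3}, \ldots, T^{e_d}$,
and so the inductive hypothesis gives their convergence in $\|\cdot\|_2$. Since
$\|g\|_\infty < \infty$ this gives also the convergence of the left-hand averages in
$\|\cdot\|_2$, as required. □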
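To make the inductive step concrete, here is how the re-arrangement reads in the smallest non-trivial case $d = 2$. This is a sketch under the hypotheses of the proof ($\mathbf{X}$ sated for the class of Proposition 3.3.1 and $f_1$ already reduced to a single product); the symbols $g$ and $h$ follow the proof, and the identification of the limit is added here only as an illustration.

```latex
% Case d = 2: g is T^{e_1}-invariant and h is T^{e_1 - e_2}-invariant,
% so g \circ T^{n e_1} = g and h \circ T^{n e_1} = h \circ T^{n e_2}.
\begin{align*}
S_N(g h, f_2)
  &= \frac{1}{N} \sum_{n=1}^{N} (g \circ T^{n e_1}) \cdot (h \circ T^{n e_1}) \cdot (f_2 \circ T^{n e_2}) \\
  &= g \cdot \frac{1}{N} \sum_{n=1}^{N} (h f_2) \circ T^{n e_2}
   \;\longrightarrow\; g \cdot \mathsf{E}\big(h f_2 \,\big|\, \Sigma^{T^{e_2}}\big)
   \quad \text{in } L^2(\mu),
\end{align*}
```

where the last step is the classical mean ergodic theorem for the single transformation $T^{e_2}$, which is exactly the base case $d = 1$ of the induction.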

Remark In fact the above proof gives a slight strengthening of Theorem C,
in that the convergence is uniform in the location of the interval of averaging:
that is, the averages

$$\frac{1}{|I_N|} \sum_{n \in I_N} (f_1 \circ T^{n e_1}) \cdot (f_2 \circ T^{n e_2}) \cdots (f_d \circ T^{n e_d})$$

converge in $L^2(\mu)$ for any sequence of increasingly long finite intervals
$I_N \subset \mathbb{Z}$, and the limit does not depend on the choice of these intervals. This
result is treated in full in [Aus09]. ◁

Chapter 4



Multiple recurrence for 
commuting transformations 

In this chapter we deduce Theorem A from Theorem 2.3.2. Coupled with
Furstenberg and Katznelson's correspondence principle from [FK78], this
gives a new proof of the Multidimensional Szemerédi Theorem, but we will
not recount that correspondence here since it is already well-known from that
paper and several subsequent accounts, such as those in the books [Fur81] of
Furstenberg and [TV06] of Tao and Vu.

After introducing a more convenient reformulation of Theorem A below,
we first introduce a very general meta-question that covers most of the
ergodic theory we need. We then show how it specializes to give quite
detailed information on the Furstenberg self-joining corresponding to a tuple of
commuting transformations. From this the proof of Theorem A follows by
appealing to a version of Tao's infinitary hypergraph removal lemma.

We will continue the practice begun in the previous chapter of writing
a tuple of commuting transformations as $T^{e_1}, T^{e_2}, \ldots, T^{e_d}$ for some
$\mathbb{Z}^d$-action $T$. The convergence result of the previous chapter implies that for
any such $T^{e_1}, T^{e_2}, \ldots, T^{e_d}$ the Furstenberg self-joining $\mu^{\mathrm{F}}$ of
Section 3.2 exists. Knowing this, Theorem A about the limit infima of scalar averages is
a consequence of the following more general result:

Theorem 4.0.2. If $T : \mathbb{Z}^d \curvearrowright (X, \Sigma, \mu)$, $\mu^{\mathrm{F}}$ denotes the
Furstenberg self-joining of the transformations $T^{e_1}, T^{e_2}, \ldots, T^{e_d}$ and
$A_1, A_2, \ldots, A_d \in \Sigma$ then

$$\mu^{\mathrm{F}}(A_1 \times A_2 \times \cdots \times A_d) = 0 \quad \Longrightarrow \quad \mu(A_1 \cap A_2 \cap \cdots \cap A_d) = 0.$$

Indeed, in case $A_i = A$ for each $i$ this assertion is precisely the
contrapositive of Theorem A. However, the formulation of Theorem 4.0.2 has the great
advantage of allowing us to manipulate the sets $A_i$ separately in setting up a
proof by induction.

4.1 The question in the background 

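Having reformulated our goal in this chapter as Theorem 4.0.2 it becomes
clear that it is really an assertion about the joint distribution of the coordinate
projections $\pi_i : X^d \to X$, $i = 1, 2, \ldots, d$ under $\mu^{\mathrm{F}}$.

By Lemma 3.2.1 $\mu^{\mathrm{F}}$ is an invariant measure for the action $\vec{T}$ of the larger
group $\mathbb{Z}^{d+1}$ defined by setting

$$\vec{T}^{e_i} := (T^{\times d})^{e_i} \ \text{ for } i = 1, 2, \ldots, d \quad \text{and} \quad \vec{T}^{e_{d+1}} := T^{e_1} \times T^{e_2} \times \cdots \times T^{e_d}.$$

Thus this defines a $\mathbb{Z}^{d+1}$-system $\vec{\mathbf{X}}$ in which the Furstenberg self-joining
corresponds to the subaction of $\mathbb{Z}^d \oplus \{0\}$. The key to our proof is the
observation that the coordinate projections $\pi_i$ now define factor maps of $\vec{\mathbf{X}}$
onto a collection of $\mathbb{Z}^{d+1}$-systems $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_d$ for each of
which some one-dimensional subgroup of $\mathbb{Z}^{d+1}$ acts trivially: specifically, this is
so with $\mathbf{X}_i = (X_i, \Sigma_i, \mu_i, T_i)$ defined simply by 'doubling up' the
$\mathbb{Z}e_i$-subaction of $T$:

$$(X_i, \Sigma_i, \mu_i) := (X, \Sigma, \mu), \quad T_i{\upharpoonright}(\mathbb{Z}^d \oplus \{0\}) := T \quad \text{and} \quad T_i^{e_{d+1}} := T^{e_i}.$$

It follows immediately from these specifications that
$\pi_i \circ \vec{T} = T_i \circ \pi_i$ and that

$$T_i^{e_{d+1} - e_i} = \mathrm{id}.$$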
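For orientation, the 'doubling up' construction can be checked by hand in the case $d = 2$. The display below is a sketch of that verification; it assumes that the enlarged action is the diagonal $\mathbb{Z}^2$-action together with $e_3$ acting as $T^{e_1} \times T^{e_2}$, which is how the invariance from Lemma 3.2.1 enters, and it writes $\vec{T}$ for that enlarged action.

```latex
% Case d = 2 on X^2 with coordinate projections \pi_1, \pi_2.
% Assumed enlarged action: \vec{T}^{e_1} = T^{e_1} \times T^{e_1},
% \vec{T}^{e_2} = T^{e_2} \times T^{e_2}, \vec{T}^{e_3} = T^{e_1} \times T^{e_2}.
\begin{align*}
\pi_1 \circ \vec{T}^{e_3} &= T^{e_1} \circ \pi_1 = T_1^{e_3} \circ \pi_1,
  & T_1^{e_3 - e_1} &= T^{e_1} \circ T^{-e_1} = \mathrm{id},\\
\pi_2 \circ \vec{T}^{e_3} &= T^{e_2} \circ \pi_2 = T_2^{e_3} \circ \pi_2,
  & T_2^{e_3 - e_2} &= T^{e_2} \circ T^{-e_2} = \mathrm{id}.
\end{align*}
```

So in this case $\mathbf{X}_1$ is invariant under the subgroup $\mathbb{Z}(e_3 - e_1)$ and $\mathbf{X}_2$ under $\mathbb{Z}(e_3 - e_2)$, which are the one-dimensional subgroups referred to above.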

Having made these observations, our principal results on $\mu^{\mathrm{F}}$ will fall within
the pattern of the following:

Meta-question:

Given subgroups $\Gamma_1, \Gamma_2, \ldots, \Gamma_r \le \mathbb{Z}^D$ and $\mathbb{Z}^D$-systems
$(X_i, \Sigma_i, \mu_i, T_i)$ for $i = 1, 2, \ldots, r$ such that
$T_i{\upharpoonright}\Gamma_i = \mathrm{id}$, what do these partial invariances imply about the
possible joinings of these $\mathbb{Z}^D$-systems?

The first stage in proving Theorem 4.0.2 will boil down to a handful of
special cases of this question. In this section we show that a partial answer
covering all of the cases we need can be given quite easily, subject to an
algebraic constraint on the subgroups and an allowance to pass to extended
systems.

First, it is instructive to understand the simple case r = 2: 

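Lemma 4.1.1. If the systems $\mathbf{X}_i$ are $\Gamma_i$-partially invariant for $i = 1, 2$, then
any joining of them is relatively independent over their factors
$\Sigma_i^{T_i \upharpoonright (\Gamma_1 + \Gamma_2)}$.

Proof Suppose $\pi_i : (Y, \Phi, \nu, S) \to (X_i, \Sigma_i, \mu_i, T_i)$ is a joining of the two
systems and consider subsets $A_i \in \Sigma_i$. In addition let $(F_N)_{N \ge 1}$ be a Følner
sequence of subsets of $\Gamma_1$. Then the invariance of $\nu$ and the Mean Ergodic
Theorem give

$$\begin{aligned}
\int_Y (1_{A_1} \circ \pi_1)(1_{A_2} \circ \pi_2) \, \mathrm{d}\nu
&= \lim_{N \to \infty} \frac{1}{|F_N|} \sum_{\gamma \in F_N} \int_Y (1_{A_1} \circ \pi_1)(1_{A_2} \circ T_2^{\gamma} \circ \pi_2) \, \mathrm{d}\nu \\
&= \lim_{N \to \infty} \int_Y (1_{A_1} \circ \pi_1) \Big( \Big( \frac{1}{|F_N|} \sum_{\gamma \in F_N} 1_{A_2} \circ T_2^{\gamma} \Big) \circ \pi_2 \Big) \, \mathrm{d}\nu \\
&= \int_Y (1_{A_1} \circ \pi_1) \big( \mathsf{E}_{\mu_2}(A_2 \mid \Sigma_2^{T_2 \upharpoonright \Gamma_1}) \circ \pi_2 \big) \, \mathrm{d}\nu.
\end{aligned}$$

Since $T_2{\upharpoonright}\Gamma_2 = \mathrm{id}$ the factor $\Sigma_2^{T_2 \upharpoonright \Gamma_1}$ consists
of sets that are invariant under the whole group $\Gamma_1 + \Gamma_2$, and hence agrees with
$\Sigma_2^{T_2 \upharpoonright (\Gamma_1 + \Gamma_2)}$. Arguing similarly with the roles of
$\mathbf{X}_1$ and $\mathbf{X}_2$ reversed, this shows that the above is equal to

$$\int_Y \big( \mathsf{E}_{\mu_1}(A_1 \mid \Sigma_1^{T_1 \upharpoonright (\Gamma_1 + \Gamma_2)}) \circ \pi_1 \big) \big( \mathsf{E}_{\mu_2}(A_2 \mid \Sigma_2^{T_2 \upharpoonright (\Gamma_1 + \Gamma_2)}) \circ \pi_2 \big) \, \mathrm{d}\nu,$$

as required. □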

For $r \ge 3$ we will not obtain an answer as complete as the above. However,
a natural generalization is available for certain special tuples of subgroups,
subject to the further provision that we may replace the originally-given
systems $\mathbf{X}_i$ with some extensions of them. The extensions, of course,
will be sated extensions, and for them the picture is given by the following.

Theorem 4.1.2. Suppose that

$$\mathbb{Z}^D = \Gamma_1 \oplus \Gamma_2 \oplus \cdots \oplus \Gamma_r \oplus \Lambda$$

is a direct sum decomposition of $\mathbb{Z}^D$ into the subgroups $\Gamma_i$ and some auxiliary
subgroup $\Lambda$, and that $\mathbf{X}_i \in \mathsf{Z}_0^{\Gamma_i}$ for $i = 1, 2, \ldots, r$ are systems
such that each $\mathbf{X}_i$ is $\mathsf{C}_i$-sated for

$$\mathsf{C}_i := \bigvee_{j \ne i} \mathsf{Z}_0^{\Gamma_i + \Gamma_j}.$$

Then for any joining $\pi_i : \mathbf{Y} \to \mathbf{X}_i$, $i = 1, 2, \ldots, r$, the factors
$\pi_i^{-1}(\Sigma_i)$ are relatively independent over their further factors

$$\pi_i^{-1}\Big( \bigvee_{j \ne i} \Sigma_i^{T_i \upharpoonright (\Gamma_i + \Gamma_j)} \Big).$$

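Proof This is a simple appeal to the definition of satedness. We will show
that $\pi_1^{-1}(\Sigma_1)$ is relatively independent from $\bigvee_{j=2}^{r} \pi_j^{-1}(\Sigma_j)$ over
$\pi_1^{-1}\big( \bigvee_{j=2}^{r} \Sigma_1^{T_1 \upharpoonright (\Gamma_1 + \Gamma_j)} \big)$, the cases of
the other factors being similar.

Let $\Gamma' := \Gamma_2 \oplus \cdots \oplus \Gamma_r \oplus \Lambda \le \mathbb{Z}^D$, so this complements
$\Gamma_1$ in $\mathbb{Z}^D$, and let $\mathbf{Y} = (Y, \Phi, \nu, S)$. From $S$ we may construct a new
$\nu$-preserving $\mathbb{Z}^D$-action $S'$ by defining

$$(S')^{m + n} := S^n \quad \text{for } m \in \Gamma_1,\ n \in \Gamma'.$$

Let $\mathbf{Y}' := (Y, \Phi, \nu, S')$, so manifestly $\mathbf{Y}' \in \mathsf{Z}_0^{\Gamma_1}$. Similarly
define the systems $\mathbf{X}_i' = (X_i, \Sigma_i, \mu_i, T_i')$ for $i = 2, 3, \ldots, r$, so these
also have trivial $\Gamma_1$-subactions and hence in fact lie in the classes
$\mathsf{Z}_0^{\Gamma_1 + \Gamma_i}$. Since $T_1^m = \mathrm{id}_{X_1}$ for all $m \in \Gamma_1$ by
assumption, we see that $\pi_1 \circ S' = T_1 \circ \pi_1$, so $\pi_1$ still defines
a factor map $\mathbf{Y}' \to \mathbf{X}_1$. On the other hand, we also have

$$\pi_i \circ (S')^{m + n + p} = \pi_i \circ S^{n + p} = T_i^{n + p} \circ \pi_i = T_i^{p} \circ \pi_i = (T_i')^{m + n + p} \circ \pi_i$$

whenever $i = 2, 3, \ldots, r$ and $m \in \Gamma_1$, $n \in \Gamma_i$ and
$p \in \bigoplus_{j \ne 1, i} \Gamma_j \oplus \Lambda$.
Therefore $\pi_i$ is a factor map $\mathbf{Y}' \to \mathbf{X}_i'$ for $i = 2, 3, \ldots, r$, and so
$\mathbf{Y}'$ is a joining of $\mathbf{X}_1$ with members of the classes
$\mathsf{Z}_0^{\Gamma_1 + \Gamma_i}$ for $i = 2, 3, \ldots, r$: that
is, $\mathbf{Y}'$ is a $\mathsf{C}_1$-adjoining of $\mathbf{X}_1$. By the assumption of
$\mathsf{C}_1$-satedness, it follows that this adjoining is relatively independent over the
maximal $\mathsf{C}_1$-factor of $\mathbf{X}_1$, which equals
$\bigvee_{j=2}^{r} \Sigma_1^{T_1 \upharpoonright (\Gamma_1 + \Gamma_j)}$, as required. □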

Example Without the assumption of satedness, more complicated phenomena
can appear in the joint distribution of three partially-invariant systems.
For example, let $(X, \Sigma, \mu, T)$ be the $\mathbb{Z}^3$-system on the two-torus $\mathbb{T}^2$ with
its Borel σ-algebra and Haar measure defined by $T^{e_1} := R_{(\alpha, 0)}$,
$T^{e_2} := R_{(0, \alpha)}$ and $T^{e_3} := R_{(\alpha, \alpha)}$, where $R_q$ denotes the rotation
of $\mathbb{T}^2$ by an element $q \in \mathbb{T}^2$ and we choose $\alpha \in \mathbb{T}$ irrational. In this
case we have natural coordinatizations of the partially invariant factors
$\zeta_0^{T^{e_i}} : X \to \mathbb{T}$ given by

$$\zeta_0^{T^{e_1}}(t_1, t_2) = t_2, \quad \zeta_0^{T^{e_2}}(t_1, t_2) = t_1 \quad \text{and} \quad \zeta_0^{T^{e_3}}(t_1, t_2) = t_1 - t_2.$$

It follows that in this example any two of $\Sigma^{T^{e_1}}$, $\Sigma^{T^{e_2}}$ and
$\Sigma^{T^{e_3}}$ are independent, but also that any two of them generate the whole system
(and so overall independence fails).
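The claims in this example can be verified directly. The display below is a sketch of that check; the shorthand $\zeta_i$ for the three coordinatizing maps is introduced here for brevity and is not notation from the text.

```latex
% Shorthand (assumed here): \zeta_1(t_1,t_2) = t_2, \zeta_2(t_1,t_2) = t_1, \zeta_3(t_1,t_2) = t_1 - t_2.
\begin{align*}
\zeta_3\big(R_{(\alpha,\alpha)}(t_1, t_2)\big) &= (t_1 + \alpha) - (t_2 + \alpha) = \zeta_3(t_1, t_2)
  && \text{(invariance under } T^{e_3}\text{)}\\
(\zeta_1, \zeta_3) &: (t_1, t_2) \mapsto (t_2,\ t_1 - t_2)
  && \text{(integer matrix of determinant } -1\text{)}\\
(\zeta_2, \zeta_3) &: (t_1, t_2) \mapsto (t_1,\ t_1 - t_2)
  && \text{(integer matrix of determinant } -1\text{)}\\
\zeta_3 &= \zeta_2 - \zeta_1
  && \text{(so the three maps are not jointly independent)}
\end{align*}
```

Each pair of maps is thus an automorphism of the group $\mathbb{T}^2$, which preserves Haar measure: the pair is distributed as Haar measure on $\mathbb{T}^2$ (independence) and separates points (generation), while the third map is a function of the other two.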

In fact, it is possible to give a fairly complete answer to our meta-question
in the case of any three $\mathbb{Z}$-subactions of some $\mathbb{Z}^2$-action, without the
simplifying power of extending our systems. However, that answer in general
requires the handling of extensions of non-ergodic systems by measurably-varying
compact homogeneous space data: it is contained in Theorem 1.1
of [Ausc], in which such extensions are studied in suitable generality. The
full formulation of that Theorem 1.1 is rather lengthy, and will not be
repeated here; and it seems clear that matters will only become more convoluted
for larger $r$. ◁

Theorem 4.1.2 already suffices for the coming applications, but it is natural
to ask about more general collections of subgroups $\Gamma_j \le \mathbb{Z}^d$. In fact it
is possible to do slightly better than Theorem 4.1.2 with just a little extra effort:
the same conclusion holds given only that these subgroups are linearly
independent, in the sense that for any $\mathbf{n}_j \in \Gamma_j$ we have

\[ \mathbf{n}_1 + \mathbf{n}_2 + \cdots + \mathbf{n}_r = \mathbf{0} \quad\Longrightarrow\quad \mathbf{n}_j = \mathbf{0}\ \ \forall j \le r. \]

Indeed, given this linear independence, one can let $\Lambda := \Gamma_1 + \Gamma_2 + \cdots + \Gamma_r$ and
now argue as in the above proof to deduce that the conclusion holds provided
that $\mathbf{X}_i$ is $\mathsf{C}_i$-sated among all $\Lambda$-systems. However, it is not quite obvious that
this is the same as being $\mathsf{C}_i$-sated among $\mathbb{Z}^d$-systems. This turns out to be
true, but it requires the key additional result that whenever $\Lambda \le \Gamma$ are discrete
Abelian groups, $\mathbf{X}$ is a $\Gamma$-system and $\alpha : \mathbf{Y} \to \mathbf{X}^{\upharpoonright\Lambda}$ is an extension of the
$\Lambda$-subaction, there is an extension of $\Gamma$-systems $\beta : \mathbf{Z} \to \mathbf{X}$ that fits into a
commutative diagram

\[
\begin{array}{ccc}
\mathbf{Z}^{\upharpoonright\Lambda} & \xrightarrow{\ \ \beta\ \ } & \mathbf{X}^{\upharpoonright\Lambda} \\
 & \searrow \qquad \nearrow_{\alpha} & \\
 & \mathbf{Y} &
\end{array}
\]

The elementary but slightly messy proof of this can be found in Subsection
3.2 of [Ausc].

What happens when there are linear dependences among the subgroups
$\Gamma_1, \ldots, \Gamma_r$? An answer to this question could have several applications
to understanding multiple recurrence, but it is also clearly of broader interest
in ergodic theory. At present the picture remains unclear, but a number of
recent works have provided answers in several further special cases, and in
moments of optimism it now seems possible that a quite general extension
of Theorem 4.1.2 (using satedness relative to a much larger list of classes
of system) may be available. A more precise conjecture in this vein will be
formulated in Chapter 6.

Remark Before leaving this section, it is worth contrasting the feature seen 
above that linear independence is helpful with previous works in this area. In 
the early study of special cases of Theorems B or C it was generally found that 
the analysis of powers of a single transformation (or correspondingly of arith- 
metic progressions in Z) revealed more usable structure and was thus more 
tractable than the general case. Of course, Furstenberg's original Multiple 
Recurrence Theorem preceded Theorem B; and the conclusion of Theorem 
C was known in many such 'one-dimensional' cases long before the general 
case was treated (see [CL84, CL88a, CL88b, FW96, HK05, Zie07], although we
note that Conze and Lesigne did also treat a two-dimensional case of Theorem
C, and that in [Zha96] Zhang extended this result to three dimensions subject
to some additional assumptions).

The same phenomenon is apparent in the search for finitary, quantitative
approaches to Szemeredi's Theorem and its relatives. Indeed, a purely finitary
proof of the Multidimensional Szemeredi Theorem appeared only recently
in works of Rodl and Skokan [RS04], Nagle, Rodl and Schacht [NRS06]
and Gowers [Gow07], building on the development by those authors of
sufficiently powerful hypergraph variants of Szemeredi's Regularity Lemma in
graph theory. Furthermore, the known bounds for how large $N_0$ must be taken
in terms of $\delta$ and $k$ are far better for Szemeredi's Theorem than for its
multidimensional generalization, owing to the powerful methods developed by
Gowers in [Gow98, Gow01], which extend Roth's proof for $k = 3$ from [Rot53]
and are much more efficient than the hypergraph regularity proofs. As yet
these methods have resisted extension to the multidimensional setting, except
in one two-dimensional case recently treated by Shkredov [Shk05]. This story
is discussed in much greater depth in Chapters 10 and 11 of [TV06].

Running counter to this trend, the value of linear independence for the
present work is a consequence of our strategy of passing to extensions of
probability-preserving systems. Although such extensions can lose any a priori
algebraic structure (such as being a $\mathbb{Z}^d$-action in which the transformations
$T^{\mathbf{e}_i}$ are actually all powers of one fixed transformation), the various
instances of satedness that it allows us to assume will furnish enough power
to drive all of our subsequent proofs. These instances of satedness will all
be relative to joins of different classes of partially invariant systems, and, as
illustrated by the above proof of Theorem 4.1.2, the usefulness of this kind of
satedness will rely on the ability to construct new systems for which the
corresponding subgroups behave in specified ways. With this in mind it is natural
that having those subgroups linearly independent removes a potential obstacle
from these arguments, and that answering our meta-question for sated systems
will be more difficult when the subgroups exhibit some linear dependences. ◁



4.2 More on the Furstenberg self-joining

We now return to the study of the Furstenberg self-joining introduced in
the previous chapter, with the goal of deriving a structure theorem for it as
a consequence of Theorem 4.1.2 in case $\mathbf{X}$ is sated with respect to enough
difference classes. In order to formulate this structure theorem, we first settle
on some more bespoke notation.

In the following we shall make repeated reference to certain factors assembled
from the partially invariant factors of our $\mathbb{Z}^d$-action $T$, so we now
give these factors their own names. They will be indexed by subsets of
$[d] := \{1,2,\ldots,d\}$, or more generally by subfamilies of the collection $\binom{[d]}{\ge 2}$
of all subsets of $[d]$ of size at least 2. On the whole, these indexing subfamilies
will be up-sets in $\binom{[d]}{\ge 2}$: $\mathcal{I} \subseteq \binom{[d]}{\ge 2}$ is an up-set if $u \in \mathcal{I}$ and $[d] \supseteq v \supseteq u$ imply
$v \in \mathcal{I}$. For example, given $e \subseteq [d]$ we write $\langle e\rangle := \{u \in \binom{[d]}{\ge 2} : u \supseteq e\}$ (note
the non-standard feature of our notation that $e \in \langle e\rangle$ if and only if $|e| \ge 2$):
up-sets of this form are principal. We will abbreviate $\langle\{i\}\rangle$ to $\langle i\rangle$. It will also
be helpful to define the depth of a non-empty up-set $\mathcal{I}$ to be $\min\{|e| : e \in \mathcal{I}\}$.
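For small $d$ these indexing conventions can be enumerated directly. The sketch below is illustrative only and all function names are hypothetical; it represents members of $\binom{[d]}{\ge 2}$ as frozensets and implements $\langle e\rangle$, the up-set test and the depth exactly as defined above.

```python
from itertools import combinations

def subsets_at_least_two(d):
    """All subsets of [d] = {1, ..., d} of size at least 2, as frozensets."""
    ground = range(1, d + 1)
    return {frozenset(c) for r in range(2, d + 1) for c in combinations(ground, r)}

def principal_up_set(e, d):
    """<e>: the members of binom([d], >=2) that contain e."""
    e = frozenset(e)
    return {u for u in subsets_at_least_two(d) if u >= e}

def is_up_set(family, d):
    """True if every member of binom([d], >=2) containing a member of family is in family."""
    family = set(family)
    return all(v in family
               for u in family
               for v in subsets_at_least_two(d) if v >= u)

def depth(family):
    """Smallest size of a member of a non-empty family."""
    return min(len(e) for e in family)

def minimal_elements(family):
    """The antichain of inclusion-minimal members of a family."""
    family = set(family)
    return {e for e in family if not any(f < e for f in family)}
```

For instance, with $d = 3$ the up-set $\langle 1\rangle$ consists of $\{1,2\}$, $\{1,3\}$ and $\{1,2,3\}$: it has depth 2 and does not contain $\{1\}$ itself, which is the non-standard feature noted in the text.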

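The corresponding factor for $e = \{i_1, i_2, \ldots, i_k\} \subseteq [d]$ with $k \ge 2$ is
$\Phi_e := \Sigma^{T^{\mathbf{e}_{i_1}} = T^{\mathbf{e}_{i_2}} = \cdots = T^{\mathbf{e}_{i_k}}}$, the partially invariant factor for the
$(k-1)$-dimensional subgroup

\[ \mathbb{Z}(\mathbf{e}_{i_1} - \mathbf{e}_{i_2}) + \mathbb{Z}(\mathbf{e}_{i_1} - \mathbf{e}_{i_3}) + \cdots + \mathbb{Z}(\mathbf{e}_{i_1} - \mathbf{e}_{i_k}). \]

More generally, given a family $\mathcal{A} \subseteq \binom{[d]}{\ge 2}$ we define $\Phi_{\mathcal{A}} := \bigvee_{e \in \mathcal{A}} \Phi_e$.

From the ordering among the factors $\Phi_e$ it is clear that $\Phi_{\mathcal{I}} = \Phi_{\mathcal{A}}$ whenever
$\mathcal{A} \subseteq \binom{[d]}{\ge 2}$ is a family that generates $\mathcal{I}$ as an up-set, and in particular that
$\Phi_e = \Phi_{\langle e\rangle}$ when $|e| \ge 2$.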

We now return to the Furstenberg self-joining $\mu^F$. For $e = \{i_1 < i_2 <
\ldots < i_k\} \subseteq [d]$ we write $\mu^F_e$ for the Furstenberg self-joining of the
transformations $T^{\mathbf{e}_{i_1}}, T^{\mathbf{e}_{i_2}}, \ldots, T^{\mathbf{e}_{i_k}}$:

\[ \mu^F_e(A_1 \times \cdots \times A_k) := \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-n\mathbf{e}_{i_1}}(A_1) \cap \cdots \cap T^{-n\mathbf{e}_{i_k}}(A_k)\big), \]

so this clearly extends the definition of Section 3.2 in the sense that $\mu^F_{[d]} = \mu^F$.
Of course, we know the existence of each $\mu^F_e$ by the results of the previous
chapter.

We next record some simple properties of the family of self-joinings $\mu^F_e$
for $e \subseteq [d]$. Given subsets $e \subseteq e' \subseteq [d]$, in the following we write $\pi_e$ for the
coordinate projection $X^{e'} \to X^{e}$, since the choice of $e'$ will always be clear
from the context.

Lemma 4.2.1. If $e \subseteq e' \subseteq [d]$ then $(\pi_e)_{\#}\mu^F_{e'} = \mu^F_e$.

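Proof This is immediate from the definition: if $e = \{i_1 < i_2 < \cdots < i_k\} \subseteq
e' = \{j_1 < j_2 < \ldots < j_l\}$ and $A_{i_j} \in \Sigma$ for each $j \le k$ then

\[ (\pi_e)_{\#}\mu^F_{e'}(A_{i_1} \times \cdots \times A_{i_k}) := \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-n\mathbf{e}_{j_1}}(B_{j_1}) \cap \cdots \cap T^{-n\mathbf{e}_{j_l}}(B_{j_l})\big) \]

where $B_j := A_j$ if $j \in e$ and $B_j := X$ otherwise; but then this last average
simplifies summand-by-summand directly to

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-n\mathbf{e}_{i_1}}(A_{i_1}) \cap \cdots \cap T^{-n\mathbf{e}_{i_k}}(A_{i_k})\big) =: \mu^F_e(A_{i_1} \times \cdots \times A_{i_k}), \]

as required. □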



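Lemma 4.2.2. For any $e \subseteq [d]$ and $A \in \Phi_e$, we have

\[ \mu^F_e\big(\pi_i^{-1}(A) \,\triangle\, \pi_j^{-1}(A)\big) = 0 \quad \forall i,j \in e: \]

thus, the restriction $\mu^F_e \upharpoonright \Phi_e^{\otimes e}$ is just the diagonal measure $(\mu \upharpoonright \Phi_e)^{\Delta e}$.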

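Proof If $e = \{i_1 < i_2 < \cdots < i_k\}$ and $A_j \in \Phi_e$ for each $j \le k$ then by
definition we have

\begin{align*}
\mu^F_e(A_1 \times A_2 \times \cdots \times A_k)
&= \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-n\mathbf{e}_{i_1}}(A_1) \cap T^{-n\mathbf{e}_{i_2}}(A_2) \cap \cdots \cap T^{-n\mathbf{e}_{i_k}}(A_k)\big) \\
&= \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(T^{-n\mathbf{e}_{i_1}}(A_1 \cap A_2 \cap \cdots \cap A_k)\big) \\
&= \mu(A_1 \cap A_2 \cap \cdots \cap A_k),
\end{align*}

as required. □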

It follows from the last lemma that whenever $e \subseteq e'$ the factors $\pi_i^{-1}(\Phi_e) \le
\Sigma^{\otimes e'}$ for $i \in e$ are all equal up to $\mu^F_{e'}$-negligible sets. It will prove helpful later
to have a dedicated notation for these factors.

Definition 4.2.3 (Oblique copies). For each $e \subseteq [d]$ we refer to the common
$\mu^F$-completion of the $\sigma$-subalgebra $\pi_i^{-1}(\Phi_e)$, $i \in e$, as the oblique copy of
$\Phi_e$, and denote it by $\Phi_e^F$. More generally we shall refer to factors formed by
repeatedly applying $\cap$ and $\vee$ to such oblique copies as oblique factors.

We are now ready to derive the more nontrivial consequences we need
from Theorem 4.1.2. These will appear in two separate propositions.

Proposition 4.2.4. For each $i \le d$ let

\[ \mathsf{C}_i := \bigvee_{j \le d,\ j \ne i} \mathsf{Z}_0^{\mathbb{Z}(\mathbf{e}_i - \mathbf{e}_j)}. \]

If $\mathbf{X}$ is $\mathsf{C}_i$-sated for each $i$ then the coordinate projections $\pi_i : X^d \to X$ are
relatively independent under $\mu^F$ over the further factors

\[ \pi_i^{-1}\Big(\bigvee_{j \le d,\ j \ne i} \Phi_{\{i,j\}}\Big). \]
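Proof This follows by applying Theorem 4.1.2 to the $\mathbb{Z}^{d+1}$-system $\widetilde{\mathbf{X}}$
introduced at the beginning of the previous section. Indeed, as explained there
the coordinate projections $\pi_i : \widetilde{\mathbf{X}} \to \mathbf{X}_i$ witness that $\widetilde{\mathbf{X}}$ is a joining of the
systems $\mathbf{X}_i \in \mathsf{Z}_0^{\mathbb{Z}(\mathbf{e}_{d+1} - \mathbf{e}_i)}$.
Let

\[ \mathsf{D}_i := \bigvee_{j \le d,\ j \ne i} \mathsf{Z}_0^{\mathbb{Z}(\mathbf{e}_i - \mathbf{e}_j) + \mathbb{Z}(\mathbf{e}_{d+1} - \mathbf{e}_i)}, \]

an idempotent class of $\mathbb{Z}^{d+1}$-systems. Now the assumption that $\mathbf{X}$ is $\mathsf{C}_i$-sated
as a $\mathbb{Z}^d$-system implies that $\mathbf{X}_i$ is $\mathsf{D}_i$-sated as a $\mathbb{Z}^{d+1}$-system. Indeed, given
any extension of $\mathbb{Z}^{d+1}$-systems $\pi : \mathbf{Y} \to \mathbf{X}_i$ the subaction $(\mathsf{D}_i\mathbf{Y})^{\upharpoonright(\mathbb{Z}^d \oplus \{0\})}$
is clearly a member of the class $\mathsf{C}_i$, so the $\mathsf{C}_i$-satedness of $\mathbf{X}$ implies that $\pi$
is relatively independent from $\zeta_{\mathsf{D}_i}^{\mathbf{Y}} : \mathbf{Y} \to \mathsf{D}_i\mathbf{Y}$ over its further factor map
$\zeta_{\mathsf{C}_i}^{\mathbf{X}} \circ \pi$, which agrees with $\zeta_{\mathsf{D}_i}^{\mathbf{X}_i} \circ \pi$ because the whole of $\mathbf{X}_i$ is already
$\mathbb{Z}(\mathbf{e}_{d+1} - \mathbf{e}_i)$-partially invariant.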

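Setting $\Gamma_i := \mathbb{Z}(\mathbf{e}_i - \mathbf{e}_{d+1})$ for $i = 1,2,\ldots,d$ and $\Lambda := \mathbb{Z}\mathbf{e}_{d+1}$, these
subgroups define a direct-sum decomposition of $\mathbb{Z}^{d+1}$. Therefore Theorem 4.1.2
applies to tell us that the factors $\pi_i^{-1}(\Sigma)$ are relatively independent under $\mu^F$
over their further factors

\[ \pi_i^{-1}\Big(\bigvee_{j \le d,\ j \ne i} \Phi_{\{i,j\}}\Big), \]

as required. □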
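The direct-sum claim used in this proof can be sanity-checked for small $d$: the $d+1$ generators $\mathbf{e}_i - \mathbf{e}_{d+1}$ ($i \le d$) and $\mathbf{e}_{d+1}$ span $\mathbb{Z}^{d+1}$ as a direct sum of the cyclic subgroups they generate exactly when the integer matrix with these rows has determinant $\pm 1$. The following sketch (hypothetical helper names, cofactor expansion chosen only for brevity) verifies this.

```python
def integer_det(rows):
    """Determinant of a square integer matrix by cofactor expansion (small sizes only)."""
    n = len(rows)
    if n == 1:
        return rows[0][0]
    total = 0
    for col in range(n):
        minor = [r[:col] + r[col + 1:] for r in rows[1:]]
        total += (-1) ** col * rows[0][col] * integer_det(minor)
    return total

def basis_vector(i, n):
    """The i-th standard basis vector of Z^n (1-based)."""
    return [1 if j == i else 0 for j in range(1, n + 1)]

def generators(d):
    """Rows e_i - e_{d+1} for i = 1..d (the Gamma_i), followed by e_{d+1} (Lambda)."""
    n = d + 1
    last = basis_vector(n, n)
    rows = [[a - b for a, b in zip(basis_vector(i, n), last)] for i in range(1, d + 1)]
    rows.append(last)
    return rows
```

Since the matrix is upper triangular with unit diagonal, the determinant is 1 for every $d$; the check merely makes the linear independence of the subgroups concrete.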

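For our second application of Theorem 4.1.2 we need a preparatory lemma.

Lemma 4.2.5. If $\mathsf{C} \subseteq \mathsf{D}$ are idempotent classes of $\Gamma$-systems for any discrete
group $\Gamma$ and $\mathbf{X}$ is $\mathsf{C}$-sated, then $\mathsf{D}\mathbf{X}$ is also $\mathsf{C}$-sated.

Proof If $\mathbf{X}$ is $\mathsf{C}$-sated and $\pi : \mathbf{Y} \to \mathsf{D}\mathbf{X}$ is any extension, then the
relatively independent product $\widetilde{\mathbf{X}} := \mathbf{X} \times_{\{\zeta_{\mathsf{D}}^{\mathbf{X}} = \pi\}} \mathbf{Y}$ is an extension of $\mathbf{X}$ through
the first coordinate projection (it is for the sake of using this relatively independent
product that we need $\Gamma$ to be a group). Therefore by $\mathsf{C}$-satedness the
factor map $\zeta_{\mathsf{C}}^{\widetilde{\mathbf{X}}}$ is relatively independent from this coordinate projection over
the further factor map $\zeta_{\mathsf{C}}^{\mathbf{X}} : \mathbf{X} \to \mathsf{C}\mathbf{X}$ of the latter, and so the same must be
true of $\zeta_{\mathsf{C}}^{\mathbf{Y}}$. However, the factor map $\zeta_{\mathsf{C}}^{\mathbf{X}}$ is clearly contained in the factor map
$\zeta_{\mathsf{D}}^{\mathbf{X}}$ since $\mathsf{C} \subseteq \mathsf{D}$, and so it must actually equal $\zeta_{\mathsf{C}}^{\mathsf{D}\mathbf{X}} \circ \zeta_{\mathsf{D}}^{\mathbf{X}} : \mathbf{X} \to \mathsf{C}(\mathsf{D}\mathbf{X})$.
Hence $\pi$ is relatively independent from $\zeta_{\mathsf{C}}^{\mathbf{Y}}$ over its further factor map $\zeta_{\mathsf{C}}^{\mathsf{D}\mathbf{X}} \circ \pi$, as
required. □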

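Proposition 4.2.6. For each subset $e = \{i_1, i_2, \ldots, i_k\} \subseteq [d]$ let

\[ \mathsf{C}_e := \bigvee_{j \in [d]\setminus e} \mathsf{Z}_0^{\mathbb{Z}(\mathbf{e}_{i_1} - \mathbf{e}_{i_2}) + \cdots + \mathbb{Z}(\mathbf{e}_{i_1} - \mathbf{e}_{i_k}) + \mathbb{Z}(\mathbf{e}_{i_1} - \mathbf{e}_j)} \]

and suppose now that $\mathbf{X}$ is $\mathsf{C}_e$-sated for every $e$ (so this includes the assumption
of the previous proposition when $e$ is a singleton). Then under $\mu^F$ the
oblique factors have the property that $\Phi^F_{\mathcal{I}}$ and $\Phi^F_{\mathcal{I}'}$ are relatively independent
over $\Phi^F_{\mathcal{I} \cap \mathcal{I}'}$ for any up-sets $\mathcal{I}, \mathcal{I}' \subseteq \binom{[d]}{\ge 2}$.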

Proof Step 1 First observe that the result is trivial if $\mathcal{I} \supseteq \mathcal{I}'$, so now
suppose that $\mathcal{I}' = \langle e\rangle$ where $e$ is a maximal member of $\binom{[d]}{\ge 2} \setminus \mathcal{I}$. Let
$\{a_1, a_2, \ldots, a_m\}$ be the antichain of minimal elements of $\mathcal{I}$, so that $\Phi^F_{\mathcal{I}} =
\bigvee_{k \le m} \Phi^F_{a_k}$. The maximality assumption on $e$ implies that $e \cup \{j\}$ contains
some $a_k$ for every $j \in [d] \setminus e$, and so $\mathcal{I} \cap \mathcal{I}'$ is precisely the up-set generated
by these sets $e \cup \{j\}$ for $j \in [d] \setminus e$. We must therefore show that $\Phi^F_e$
is relatively independent from $\bigvee_{k \le m} \Phi^F_{a_k}$ under $\mu^F$ over the common factor
$\bigvee_{j \in [d]\setminus e} \Phi^F_{e \cup \{j\}}$.

Observe also that since $e \notin \mathcal{I}$ we can find some $j_k \in a_k \setminus e$ for each
$k \le m$. Moreover, each $j \in [d] \setminus e$ must appear as some $j_k$ in this list, since
it appears at least for any $k$ for which $a_k \subseteq e \cup \{j\}$.

Now Lemma 4.2.2 implies that $\Phi^F_{a_k}$ agrees with $\pi_{j_k}^{-1}(\Phi_{a_k})$ up to $\mu^F$-negligible
sets. On the other hand, we clearly have $\pi_{j_k}^{-1}(\Phi_{a_k}) \le \pi_{j_k}^{-1}(\Sigma)$, and so in fact
it will suffice to show that $\Phi^F_e$ is relatively independent from $\bigvee_{j \in [d]\setminus e} \pi_j^{-1}(\Sigma)$
over $\bigvee_{j \in [d]\setminus e} \Phi^F_{e \cup \{j\}}$.

This alteration of the problem is important because it provides the linear
independence needed to apply Theorem 4.1.2. Indeed, considering again the
$\mathbb{Z}^{d+1}$-system $\widetilde{\mathbf{X}}$, in the present setting we see that the $\sigma$-subalgebras

\[ \Phi^F_e \quad\text{and}\quad \pi_j^{-1}(\Sigma) \ \text{ for } j \in [d] \setminus e \]

constitute a collection of factors of $\widetilde{\mathbf{X}}$ that are partially invariant under the
subgroups

\[ \Gamma_e := \mathbb{Z}(\mathbf{e}_i - \mathbf{e}_{d+1}) + \sum_{i' \in e \setminus \{i\}} \mathbb{Z}(\mathbf{e}_i - \mathbf{e}_{i'}) \quad\text{and}\quad \Gamma_j := \mathbb{Z}(\mathbf{e}_j - \mathbf{e}_{d+1}) \ \text{ for } j \in [d] \setminus e \]

respectively, where $i \in e$ is arbitrary. On the one hand these subgroups can
be inserted into a direct sum decomposition of $\mathbb{Z}^{d+1}$, and on the other we may
argue just as in the proof of Proposition 4.2.4 that the $\mathbb{Z}^{d+1}$-system defined by
the factor $\Phi^F_e$ is sated relative to the class $\bigvee_{j \in [d]\setminus e} \mathsf{Z}_0^{\Gamma_e + \Gamma_j}$, using our satedness
assumption on $\mathbf{X}$ and Lemma 4.2.5. The conclusion therefore follows from
Theorem 4.1.2.

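Step 2 The general case can now be treated for fixed $\mathcal{I}$ by induction on
$\mathcal{I}'$. If $\mathcal{I}' \subseteq \mathcal{I}$ then the result is clear, so now let $e$ be a minimal member of
$\mathcal{I}' \setminus \mathcal{I}$ of maximal size, and let $\mathcal{I}'' := \mathcal{I}' \setminus \{e\}$. It will suffice to prove that if
$F \in L^\infty(\mu^F)$ is $\Phi^F_{\mathcal{I}'}$-measurable then

\[ \mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I}}) = \mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I} \cap \mathcal{I}'}), \]

and furthermore, by an approximation in $\|\cdot\|_2$ by finite sums of products, to
do so only for $F$ that are of the form $F_1 \cdot F_2$ with $F_1$ and $F_2$ being bounded
and respectively $\Phi^F_e$- and $\Phi^F_{\mathcal{I}''}$-measurable. However, for such a product we
can write

\[ \mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I}}) = \mathsf{E}_{\mu^F}\big(\mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I} \cup \mathcal{I}''}) \,\big|\, \Phi^F_{\mathcal{I}}\big) = \mathsf{E}_{\mu^F}\big(\mathsf{E}_{\mu^F}(F_1 \mid \Phi^F_{\mathcal{I} \cup \mathcal{I}''}) \cdot F_2 \,\big|\, \Phi^F_{\mathcal{I}}\big). \]

By Step 1 we have

\[ \mathsf{E}_{\mu^F}(F_1 \mid \Phi^F_{\mathcal{I} \cup \mathcal{I}''}) = \mathsf{E}_{\mu^F}(F_1 \mid \Phi^F_{(\mathcal{I} \cup \mathcal{I}'') \cap \langle e\rangle}), \]

and on the other hand $(\mathcal{I} \cup \mathcal{I}'') \cap \langle e\rangle \subseteq \mathcal{I}''$ (because $\mathcal{I}''$ contains every subset
of $[d]$ that strictly includes $e$, since $\mathcal{I}'$ is an up-set), so $(\mathcal{I} \cup \mathcal{I}'') \cap \langle e\rangle = \mathcal{I}'' \cap \langle e\rangle$
and therefore another appeal to Step 1 gives

\[ \mathsf{E}_{\mu^F}(F_1 \mid \Phi^F_{(\mathcal{I} \cup \mathcal{I}'') \cap \langle e\rangle}) = \mathsf{E}_{\mu^F}(F_1 \mid \Phi^F_{\mathcal{I}''}). \]

Therefore the above expression for $\mathsf{E}_{\mu^F}(F_1 F_2 \mid \Phi^F_{\mathcal{I}})$ simplifies to

\begin{align*}
\mathsf{E}_{\mu^F}\big(\mathsf{E}_{\mu^F}(F_1 \mid \Phi^F_{\mathcal{I}''}) \cdot F_2 \,\big|\, \Phi^F_{\mathcal{I}}\big)
&= \mathsf{E}_{\mu^F}\big(\mathsf{E}_{\mu^F}(F_1 \cdot F_2 \mid \Phi^F_{\mathcal{I}''}) \,\big|\, \Phi^F_{\mathcal{I}}\big) \\
&= \mathsf{E}_{\mu^F}\big(\mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I}''}) \,\big|\, \Phi^F_{\mathcal{I}}\big) = \mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I} \cap \mathcal{I}''}) = \mathsf{E}_{\mu^F}(F \mid \Phi^F_{\mathcal{I} \cap \mathcal{I}'}),
\end{align*}

where the third equality follows by the inductive hypothesis applied to $\mathcal{I}''$ and
$\mathcal{I}$. □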



4.3 Infinitary hypergraph removal and completion of the proof

Propositions 4.2.4 and 4.2.6 tell us a great deal about the structure of the
probability measure $\mu^F$ for a system $\mathbf{X}$ that is sated relative to all the necessary
classes in terms of the partially-ordered family of factors





by showing that large collections of the $\sigma$-subalgebras appearing here are
relatively independent over the collections of further $\sigma$-subalgebras that they have
in common.

It is worth stressing at this point that we have not proved any such assertion
for the joint distribution of all the original factors $\Phi_e \le \Sigma$, but only for their
oblique copies inside $\Sigma^{\otimes d}$. The problem of describing the joint distribution
of the factors $\Phi_e$ themselves seems to be much harder, because it runs into
precisely the difficulties with linear dependence discussed in Section 4.1: for
example, if $e_1, e_2, e_3 \subseteq [d]$ are three subsets that are pairwise non-disjoint,
then we have $\Phi_{e_i} = \Sigma^{T^{\Gamma_{e_i}}}$ for $\Gamma_{e_i} = \sum_{j,j' \in e_i} \mathbb{Z}(\mathbf{e}_j - \mathbf{e}_{j'})$, and these three
subgroups are now clearly not linearly independent. In our analysis of the
oblique factors $\Phi^F_e$ we carefully avoided a similar problem during Step 1 of the
proof of Proposition 4.2.6, where we exploited the fact that $\Phi^F_e$ is contained
modulo negligible sets in $\pi_j^{-1}(\Sigma)$ for any choice of $j \in e$, so that by making
careful choices of the coordinates with which to express these oblique copies
we were able to reduce the joint distribution of interest to the case covered by
Theorem 4.1.2 involving only linearly independent subgroups. However, it
seems clear that no similar trick will be available in the study of the factors
$\Phi_e$.
Happily, however, we do not need any such more precise information
to complete our proof of Theorem 4.0.2: in the remainder of this chapter
we show how the structure proved above for $\mu^F$ suffices. This will proceed
through a slight modification of Tao's infinitary hypergraph removal lemma
from [Tao07], which first appeared in the form given below in [Ausb].

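Proposition 4.3.1. Suppose that $(X, \Sigma, \mu)$ is a standard Borel space and $\lambda$
is a $d$-fold coupling of $\mu$ on $(X^d, \Sigma^{\otimes d})$ with coordinate projection maps $\pi_i :
X^d \to X$, and that $(\Psi_e)_e$ is a collection of $\sigma$-subalgebras of $\Sigma$ indexed by
subsets $e \in \binom{[d]}{\ge 2}$ with the following properties:

[i] if $e \subseteq e'$ then $\Psi_e \ge \Psi_{e'}$;

[ii] if $i,j \in e$ and $A \in \Psi_e$ then $\lambda(\pi_i^{-1}(A) \,\triangle\, \pi_j^{-1}(A)) = 0$, so that we may
let $\Psi^\dagger_e$ be the common $\lambda$-completion of the lifted $\sigma$-algebras $\pi_i^{-1}(\Psi_e)$
for $i \in e$;

[iii] if we define $\Psi^\dagger_{\mathcal{I}} := \bigvee_{e \in \mathcal{I}} \Psi^\dagger_e$ for each up-set $\mathcal{I} \subseteq \binom{[d]}{\ge 2}$, then the $\sigma$-subalgebras
$\Psi^\dagger_{\mathcal{I}}$ and $\Psi^\dagger_{\mathcal{I}'}$ are relatively independent under $\lambda$ over $\Psi^\dagger_{\mathcal{I} \cap \mathcal{I}'}$.

In addition, suppose that $\mathcal{I}_{i,j}$ for $i = 1,2,\ldots,d$ and $j = 1,2,\ldots,k_i$ are
collections of up-sets in $\binom{[d]}{\ge 2}$ such that $[d] \in \mathcal{I}_{i,j} \subseteq \langle i\rangle$ for each $i,j$, and that
the sets $A_{i,j} \in \Psi_{\mathcal{I}_{i,j}}$ are such that

\[ \lambda\Big(\prod_{i=1}^{d} \Big(\bigcap_{j=1}^{k_i} A_{i,j}\Big)\Big) = 0. \]

Then we must also have

\[ \mu\Big(\bigcap_{i=1}^{d} \bigcap_{j=1}^{k_i} A_{i,j}\Big) = 0. \]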

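Proof of Theorem 4.0.2 from Proposition 4.3.1 Clearly the conclusion
holds for a system $\mathbf{X}$ if it holds for any extension of $\mathbf{X}$, so by Theorem 2.3.2
we may assume that $\mathbf{X}$ is $\mathsf{C}_e$-sated for every $e \subseteq [d]$.

Now suppose that $A_1, A_2, \ldots, A_d \in \Sigma$ are such that $\mu^F(A_1 \times A_2 \times \cdots \times
A_d) = 0$. Then by Proposition 4.2.4 we have

\[ \mu^F(A_1 \times A_2 \times \cdots \times A_d) = \int_{X^d} \prod_{i=1}^{d} \mathsf{E}_\mu(1_{A_i} \mid \Phi_{\langle i\rangle}) \circ \pi_i \, d\mu^F = 0. \]

The level set $B_i := \{\mathsf{E}_\mu(1_{A_i} \mid \Phi_{\langle i\rangle}) > 0\}$ (of course, this is unique only up
to $\mu$-negligible sets) lies in $\Phi_{\langle i\rangle}$, and the above vanishing requires that also
$\mu^F(B_1 \times B_2 \times \cdots \times B_d) = 0$. Now setting $k_i = 1$, $\mathcal{I}_{i,1} := \langle i\rangle$ and $A_{i,1} := B_i$ for
each $i \le d$, Lemma 4.2.2 and Proposition 4.2.6 imply that Proposition 4.3.1
applies to the partially invariant factors $\Phi_e$ and their oblique copies to give
$\mu(B_1 \cap B_2 \cap \cdots \cap B_d) = 0$. On the other hand we must have $\mu(A_i \setminus B_i) = 0$
for each $i$, and so overall $\mu(A_1 \cap \cdots \cap A_d) \le \mu(B_1 \cap B_2 \cap \cdots \cap B_d) + \sum_{i=1}^{d} \mu(A_i \setminus B_i) = 0$,
as required. □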

The remainder of this chapter is given to the proof of Proposition 4.3.1.
This proceeds by induction on a suitable ordering of the possible collections of
up-sets $(\mathcal{I}_{i,j})_{i,j}$, appealing to a handful of different possible cases at different
steps of the induction. At the outermost level, this induction will be organized
according to the depth of our up-sets.

The proof given below is taken essentially unchanged from [Ausb], where
in turn the statement and proof were adopted with only slight modifications
from [Tao07]. The reader may consult [Ausb] for an explanation of these
modifications.

Definition 4.3.2. A family $(\mathcal{I}_{i,j})_{i,j}$ has the property P if it satisfies the
conclusion of Proposition 4.3.1.

We separate the various components of the induction into separate
lemmas.

Lemma 4.3.3 (Lifting using relative independence). Suppose that all up-sets
in the collection $(\mathcal{I}_{i,j})_{i,j}$ have depth at least $k$, that all those with depth exactly
$k$ are principal, and that there are $\ell \ge 1$ of these. Then if property P holds for
all similar collections having $\ell - 1$ up-sets of depth $k$, then it holds also for
this collection.

Proof Let $\mathcal{I}_{i_1,j_1} = \langle e_1\rangle$, $\mathcal{I}_{i_2,j_2} = \langle e_2\rangle$, \ldots, $\mathcal{I}_{i_\ell,j_\ell} = \langle e_\ell\rangle$ be an enumeration
of all the (principal) up-sets of depth $k$ in our collection. We will treat two
separate cases.

First suppose that two of the generating sets agree; by re-ordering if necessary
we may assume that $e_1 = e_2$. Clearly we can assume that there are no
duplicates among the coordinate-collections $(\mathcal{I}_{i,j})_{j=1}^{k_i}$ for each $i$ separately, so
we must have $i_1 \ne i_2$. However, if we now suppose that $A_{i,j} \in \Psi_{\mathcal{I}_{i,j}}$ for each $i$,
$j$ are such that

\[ \lambda\Big(\prod_{i=1}^{d} \Big(\bigcap_{j=1}^{k_i} A_{i,j}\Big)\Big) = 0, \]

then by assumption [ii] the same equality holds if we simply replace $A_{i_1,j_1} \in
\Psi_{\langle e_1\rangle}$ with $A'_{i_1,j_1} := A_{i_1,j_1} \cap A_{i_2,j_2}$ and $A_{i_2,j_2}$ with $A'_{i_2,j_2} := X$. Now this last set
can simply be ignored to leave an instance of a $\lambda$-negligible product for the
same collection of up-sets omitting $\mathcal{I}_{i_2,j_2}$, and so property P of this reduced
collection completes the proof.

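On the other hand, if all the $e_i$ are distinct, we shall simplify the last of
the principal up-sets $\mathcal{I}_{i_\ell,j_\ell}$ by exploiting the relative independence among the
lifted $\sigma$-algebras $\Psi^\dagger_e$. Assume for notational simplicity that $(i_\ell,j_\ell) = (1,1)$;
clearly this will not affect the proof. We will reduce to an instance of property
P associated to the collection $(\mathcal{I}'_{i,j})$ defined by

\[ \mathcal{I}'_{i,j} := \begin{cases} \langle e_\ell\rangle \setminus \{e_\ell\} & \text{if } (i,j) = (1,1), \\ \mathcal{I}_{i,j} & \text{else,} \end{cases} \]

which has one fewer up-set of depth $k$ and so falls under the inductive
assumption.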








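Indeed, by property [iii] under $\lambda$ the set $\pi_1^{-1}(A_{1,1})$ is relatively independent
from all the sets $\pi_i^{-1}(A_{i,j})$, $(i,j) \ne (1,1)$, over the $\sigma$-algebra $\pi_1^{-1}(\Psi_{\langle e_\ell\rangle \setminus \{e_\ell\}})$,
which is dense inside $\Psi^\dagger_{\langle e_\ell\rangle \setminus \{e_\ell\}}$. Therefore

\begin{align*}
0 &= \lambda\Big(\prod_{i=1}^{d} \Big(\bigcap_{j=1}^{k_i} A_{i,j}\Big)\Big) \\
&= \int_{X^d} \big(\mathsf{E}_\mu(1_{A_{1,1}} \mid \Psi_{\langle e_\ell\rangle \setminus \{e_\ell\}}) \circ \pi_1\big) \cdot \Big(\prod_{j=2}^{k_1} 1_{A_{1,j}} \circ \pi_1\Big) \cdot \Big(\prod_{i=2}^{d} \prod_{j=1}^{k_i} 1_{A_{i,j}} \circ \pi_i\Big) \, d\lambda.
\end{align*}

Setting $A'_{1,1} := \{\mathsf{E}_\mu(1_{A_{1,1}} \mid \Psi_{\langle e_\ell\rangle \setminus \{e_\ell\}}) > 0\} \in \Psi_{\langle e_\ell\rangle \setminus \{e_\ell\}}$ and $A'_{i,j} := A_{i,j}$ for
$(i,j) \ne (1,1)$, we have that $\mu(A_{1,1} \setminus A'_{1,1}) = 0$ and it follows from the above
equality that also $\lambda\big(\prod_{i=1}^{d} \big(\bigcap_{j=1}^{k_i} A'_{i,j}\big)\big) = 0$, so an appeal to property P for
the reduced collection of up-sets completes the proof. □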



Lemma 4.3.4 (Lifting under finitary generation). Suppose that all up-sets in
the collection $(\mathcal{I}_{i,j})_{i,j}$ have depth at least $k$ and that among those of depth $k$
there are $\ell \ge 1$ that are non-principal. Then if property P holds for all similar
collections having at most $\ell - 1$ non-principal up-sets of depth $k$, then it also
holds for this collection.

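Proof Let $\mathcal{I}_{i_1,j_1}, \mathcal{I}_{i_2,j_2}, \ldots, \mathcal{I}_{i_\ell,j_\ell}$ be the non-principal up-sets of depth $k$,
and now in addition let $e_1, e_2, \ldots, e_r$ be all the members of $\mathcal{I}_{i_\ell,j_\ell}$ of size
$k$ (so, of course, $r \le \binom{d}{k}$). Once again we will assume for simplicity that
$(i_\ell,j_\ell) = (1,1)$. We break our work into two further steps.

Step 1 First consider the case of a collection $(A_{i,j})_{i,j}$ such that for the
set $A_{1,1}$, we can actually find finite subalgebras of sets $\mathcal{B}_s \subseteq \Psi_{e_s}$ for $s =
1,2,\ldots,r$ such that $A_{1,1} \in \mathcal{B}_1 \vee \mathcal{B}_2 \vee \cdots \vee \mathcal{B}_r \vee \Psi_{\mathcal{I}_{1,1} \cap \binom{[d]}{\ge k+1}}$ (so $A_{1,1}$ lies
in one of our non-principal up-sets of depth $k$, but it fails to lie in an up-set
of depth $k+1$ only 'up to' finitely many additional generating sets). Choose
$M \ge \max_{s \le r} |\mathcal{B}_s|$, so that we can certainly express

\[ A_{1,1} = \bigcup_{m=1}^{M^r} (B_{m,1} \cap B_{m,2} \cap \cdots \cap B_{m,r} \cap C_m) \]

with $B_{m,s} \in \mathcal{B}_s$ for each $s \le r$ and $C_m \in \Psi_{\mathcal{I}_{1,1} \cap \binom{[d]}{\ge k+1}}$. Inserting this expression
into the equation

\[ \lambda\Big(\prod_{i=1}^{d} \Big(\bigcap_{j=1}^{k_i} A_{i,j}\Big)\Big) = 0 \]

now gives that each of the $M^r$ individual product sets

\[ \Big((B_{m,1} \cap B_{m,2} \cap \cdots \cap B_{m,r} \cap C_m) \cap \bigcap_{j=2}^{k_1} A_{1,j}\Big) \times \prod_{i=2}^{d} \Big(\bigcap_{j=1}^{k_i} A_{i,j}\Big) \]

is $\lambda$-negligible.

Now consider the family of up-sets comprising the original $\mathcal{I}_{i,j}$ if $i =
2,3,\ldots,d$ and the collection $\langle e_1\rangle, \langle e_2\rangle, \ldots, \langle e_r\rangle$, $\mathcal{I}_{1,1} \cap \binom{[d]}{\ge k+1}$, $\mathcal{I}_{1,2}, \mathcal{I}_{1,3}, \ldots, \mathcal{I}_{1,k_1}$
corresponding to $i = 1$. We have broken the depth-$k$ non-principal up-set $\mathcal{I}_{1,1}$ into
the higher-depth up-set $\mathcal{I}_{1,1} \cap \binom{[d]}{\ge k+1}$ and the principal up-sets $\langle e_s\rangle$, and so
there are only $\ell - 1$ minimal-depth non-principal up-sets in this new family.
It is clear that for each $m \le M^r$ the above product set is associated to this
family of up-sets, and so an inductive appeal to property P for this family tells
us that also

\[ \mu\Big((B_{m,1} \cap B_{m,2} \cap \cdots \cap B_{m,r} \cap C_m) \cap \bigcap_{j=2}^{k_1} A_{1,j} \cap \bigcap_{i=2}^{d} \bigcap_{j=1}^{k_i} A_{i,j}\Big) = 0 \]

for every $m \le M^r$. Since the union of these sets is just $\bigcap_{i=1}^{d} \bigcap_{j=1}^{k_i} A_{i,j}$, this
gives the desired negligibility in this case.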

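Step 2 Now we return to the general case, which will follow by a suitable
limiting argument applied to the conclusion of Step 1. Since any $\Psi_e$ is
countably generated modulo $\mu$, for each $e$ with $|e| = k$ we can find an increasing
sequence of finite subalgebras $\mathcal{B}_{e,1} \subseteq \mathcal{B}_{e,2} \subseteq \cdots$ that generates $\Psi_e$ up to
$\mu$-negligible sets. In terms of these define approximating sub-$\sigma$-algebras

\[ \Xi^{(n)}_{i,j} := \Psi_{\mathcal{I}_{i,j} \cap \binom{[d]}{\ge k+1}} \vee \bigvee_{e \in \mathcal{I}_{i,j} \cap \binom{[d]}{k}} \mathcal{B}_{e,n}, \]

so for each $\mathcal{I}_{i,j}$ these form an increasing family of $\sigma$-algebras that generates
$\Psi_{\mathcal{I}_{i,j}}$ up to $\mu$-negligible sets (indeed, if $\mathcal{I}_{i,j}$ does not contain any sets of the
minimal depth $k$ then we simply have $\Xi^{(n)}_{i,j} = \Psi_{\mathcal{I}_{i,j}}$ for all $n$). Now property
[iii] implies for each $n$ that $\pi_i^{-1}(\Psi_{\mathcal{I}_{i,j}})$ and $\bigvee_{(i',j') \ne (i,j)} \pi_{i'}^{-1}(\Xi^{(n)}_{i',j'})$ are relatively
independent over $\pi_i^{-1}(\Xi^{(n)}_{i,j})$.

Given now a family of sets $(A_{i,j})_{i,j}$ associated to $(\mathcal{I}_{i,j})_{i,j}$, for each $(i,j)$
the conditional expectations $\mathsf{E}_\mu(1_{A_{i,j}} \mid \Xi^{(n)}_{i,j})$ form an almost surely uniformly
bounded martingale converging to $1_{A_{i,j}}$ in $L^2(\mu)$. Letting

\[ B^{(n)}_{i,j} := \{\mathsf{E}_\mu(1_{A_{i,j}} \mid \Xi^{(n)}_{i,j}) > 1 - \delta\} \]

for some small $\delta > 0$ (to be specified momentarily), it is clear that we also
have $\mu(A_{i,j} \,\triangle\, B^{(n)}_{i,j}) \to 0$ as $n \to \infty$. Let

\[ F := \prod_{i=1}^{d} \Big(\bigcap_{j=1}^{k_i} B^{(n)}_{i,j}\Big). \]

We now compute using the above-mentioned relative independence that

\[ \lambda(F \setminus \pi_i^{-1}(A_{i,j})) = \int_{X^d} \big(\mathsf{E}_\mu(1_{B^{(n)}_{i,j} \setminus A_{i,j}} \mid \Xi^{(n)}_{i,j}) \circ \pi_i\big) \cdot \Big(\prod_{(i',j') \ne (i,j)} 1_{B^{(n)}_{i',j'}} \circ \pi_{i'}\Big) \, d\lambda \]

for each pair $(i,j)$. However, from the definition of $B^{(n)}_{i,j}$ we must have

\[ \mathsf{E}_\mu(1_{B^{(n)}_{i,j} \setminus A_{i,j}} \mid \Xi^{(n)}_{i,j}) \le \delta \, 1_{B^{(n)}_{i,j}} \]

almost surely, and therefore the above integral expression implies that

\[ \lambda(F \setminus \pi_i^{-1}(A_{i,j})) \le \delta \int_{X^d} \big(1_{B^{(n)}_{i,j}} \circ \pi_i\big) \cdot \Big(\prod_{(i',j') \ne (i,j)} 1_{B^{(n)}_{i',j'}} \circ \pi_{i'}\Big) \, d\lambda = \delta \lambda(F). \]

From this we can estimate as follows:

\[ \lambda(F) \le \lambda\Big(\prod_{i=1}^{d} \Big(\bigcap_{j=1}^{k_i} A_{i,j}\Big)\Big) + \sum_{(i,j)} \lambda(F \setminus \pi_i^{-1}(A_{i,j})) \le 0 + \Big(\sum_{i=1}^{d} k_i\Big) \delta \lambda(F), \]

and so provided we chose $\delta < \big(\sum_{i=1}^{d} k_i\big)^{-1}$ we must in fact have $\lambda(F) = 0$.

We have now obtained sets $(B^{(n)}_{i,j})_{i,j}$ that are associated to the family
$(\mathcal{I}_{i,j})_{i,j}$ and satisfy the property of lying in finitely-generated extensions of the
relevant factors corresponding to the members of the $\mathcal{I}_{i,j}$ of minimal size, and
so we can apply the result of Step 1 to deduce that $\mu\big(\bigcap_{i=1}^{d} \bigcap_{j=1}^{k_i} B^{(n)}_{i,j}\big) = 0$.
It follows that

\[ \mu\Big(\bigcap_{i=1}^{d} \bigcap_{j=1}^{k_i} A_{i,j}\Big) \le \sum_{i,j} \mu(A_{i,j} \setminus B^{(n)}_{i,j}) \to 0 \quad \text{as } n \to \infty, \]

as required. □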



Proof of Proposition 4.3.1 We first take as our base case $k_i = 1$ and $\mathcal{I}_{i,1} =
\{[d]\}$ for each $i = 1,2,\ldots,d$. In this case we know from property [ii] that
for any $A \in \Psi_{[d]}$ the pre-images $\pi_i^{-1}(A)$ are all equal up to negligible sets,
and so given $A_1, A_2, \ldots, A_d \in \Psi_{[d]}$ we have $0 = \lambda(A_1 \times A_2 \times \cdots \times A_d) =
\mu(A_1 \cap A_2 \cap \cdots \cap A_d)$.

The remainder of the proof now just requires putting the preceding lemmas
into order to form an induction with three layers: if our collection has any
non-principal up-sets of minimal depth, then Lemma 4.3.4 allows us to reduce
their number at the expense only of introducing new principal up-sets of the
same depth; and having removed all the non-principal minimal-depth up-sets,
Lemma 4.3.3 enables us to remove also the principal ones until we are left
only with up-sets of increased minimal depth. This completes the proof. □

Chapter 5

The Density Hales-Jewett Theorem
Much as for Szemeredi's Theorem and its multidimensional generalization,
the Ergodic Ramsey Theory approach to Theorem B begins by establishing its
equivalence to a result about stochastic processes. We have deferred the
introduction of the stochastic processes analog of Theorem B until now because it
involves a less well-known family of processes than the tuples of commuting
transformations that appear in Theorem A, and these new stochastic processes
require a separate discussion. The proof from [FK91] of the correspondence
between Theorem B and an assertion about these processes is also less
well-known, and so we recall this in the first section below for completeness.

After formulating the stochastic processes result to which Theorem B is
equivalent, we introduce an additional semigroup $\Gamma$ of transformations on
these processes and argue that we may reduce further to the case of processes
whose distributions are invariant. This leaves us with a class of $\Gamma$-systems, on
which we will bring a notion of satedness to bear. However, as promised at the
beginning of Chapter 2, this first requires some modifications to that notion,
effectively by imposing additional restrictions on the factor maps we allow
in our theory of a kind not involved heretofore. With these modifications in
place we will proceed to analogs of Propositions 4.2.4 and 4.2.6 and thence
to the proof of Theorem B.

5.1 The correspondence with a class of stationary processes

Combinatorial notation

In addition to the finite spaces $[k]^N$ appearing in the statement of Theorem B,
we will work with their union

\[ [k]^* := \bigcup_{N \ge 1} [k]^N. \]

The spaces $[k]^N$ and $[k]^*$ are referred to as the $N$-dimensional and
infinite-dimensional combinatorial spaces over the alphabet $[k]$ respectively. Most
of this chapter will consider probabilities on product spaces indexed by $[k]^*$.
If $A \subseteq [k]^N$ then we denote its density by

\[ d(A) := \frac{|A|}{k^N}; \]

thus the assumption of Theorem B is that $N$ is sufficiently large in terms of $k$
and $d(A)$.

Given two finite words $u, v \in [k]^*$ we denote their concatenation by either
$uv$ or $u \oplus v$. For any finite $n$ we define an $n$-dimensional subspace of $[k]^*$
to be an injection $\phi : [k]^n \hookrightarrow [k]^*$ specified as follows: for some integers $0 =
N_0 < N_1 < N_2 < \ldots < N_n$, nonempty subsets $I_1 \subseteq [N_1]$, $I_2 \subseteq [N_2] \setminus [N_1]$, \ldots,
$I_n \subseteq [N_n] \setminus [N_{n-1}]$ and a word $w \in [k]^{N_n}$ we let $\phi(v_1 v_2 \cdots v_n)$ be the
word in $[k]^*$ of length $N_n$ given by

\[ \phi(v_1 v_2 \cdots v_n)_m := \begin{cases} w_m & \text{if } m \in [N_n] \setminus (I_1 \cup I_2 \cup \cdots \cup I_n), \\ v_i & \text{if } m \in I_i. \end{cases} \]

In these terms a combinatorial line is simply a 1-dimensional combinatorial
subspace.

Similarly, an infinite-dimensional subspace (or often just subspace) of
$[k]^*$ is an injection $\phi : [k]^* \hookrightarrow [k]^*$ specified using some infinite sequence
$0 = N_0 < N_1 < N_2 < \cdots$, nonempty subsets $I_{i+1} \subseteq [N_{i+1}] \setminus [N_i]$ and
words $w_i \in [k]^{N_i}$, where for any $v \in [k]^n$ its image $\phi(v)$ has length $N_n$ and is
given by the above formula with $w := w_n$. It is clear that the collection of all
subspaces of $[k]^*$ forms a semigroup $\Gamma$ under composition.

Finally, let us define letter-replacement maps: given $i \in [k]$ and $e \subseteq [k]$,
for each $N \ge 1$ we define $r^N_{e,i} : [k]^N \to [k]^N$ by

\[ r^N_{e,i}(w)_m := \begin{cases} i & \text{if } w_m \in e, \\ w_m & \text{if } w_m \in [k] \setminus e \end{cases} \]

for $m \le N$, and let

\[ r_{e,i} := \bigcup_{N \ge 1} r^N_{e,i} : [k]^* \to [k]^* \]

(so clearly $r_{e,i}$ actually takes values in the subset $(([k] \setminus e) \cup \{i\})^* \subseteq [k]^*$).
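The combinatorial notation above is easy to experiment with on small alphabets. The following sketch is illustrative only: the names `make_subspace`, `line_points` and `letter_replacement` are hypothetical, words are represented as tuples of letters from $\{1,\ldots,k\}$, and positions are 1-based as in the text. The function does not verify the block condition $I_j \subseteq [N_j] \setminus [N_{j-1}]$; it only assumes the wildcard sets are disjoint and nonempty.

```python
from itertools import product

def make_subspace(w, wildcard_sets):
    """Return phi for the subspace with fixed word w (a tuple of length N_n)
    and wildcard position sets I_1, ..., I_n (1-based, disjoint, nonempty)."""
    def phi(v):
        assert len(v) == len(wildcard_sets)
        word = list(w)
        for letter, positions in zip(v, wildcard_sets):
            for m in positions:
                word[m - 1] = letter  # coordinates in I_i receive the letter v_i
        return tuple(word)
    return phi

def line_points(k, w, wildcards):
    """The k points of the combinatorial line with fixed word w and wildcard set I_1."""
    phi = make_subspace(w, [wildcards])
    return [phi((i,)) for i in range(1, k + 1)]

def letter_replacement(e, i):
    """r_{e,i}: replace every letter of a word that lies in e by the letter i."""
    e = frozenset(e)
    def r(word):
        return tuple(i if letter in e else letter for letter in word)
    return r
```

For example, with $k = 3$, $w = 1321$ and $I_1 = \{2,4\}$ the line consists of the words $1121$, $1222$ and $1323$. The letter-replacement maps are idempotent, which matches the remark that $r_{e,i}$ takes values in $(([k] \setminus e) \cup \{i\})^*$.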



Reformulation in terms of stochastic processes 

The correspondence that Furstenberg and Katznelson establish for Theorem B
is between dense subsets of the finite-dimensional combinatorial spaces $[k]^N$
and stochastic processes indexed by the infinite-dimensional combinatorial
space $[k]^*$.

Theorem 5.1.1 (Infinitary Density Hales-Jewett Theorem). For any $\delta > 0$, if
$\mu$ is a Borel probability measure on $\{0,1\}^{[k]^*}$ with the property that

\[ \mu\{x \in \{0,1\}^{[k]^*} : x_w = 1\} \ge \delta \quad \forall w \in [k]^*, \]

then there is a combinatorial line $\phi : [k] \hookrightarrow [k]^*$ such that

\[ \mu\{x \in \{0,1\}^{[k]^*} : x_{\phi(i)} = 1\ \forall i \in [k]\} > 0. \]

Proof of Theorem B from Theorem 5.1.1 Clearly we may restrict our
attention to $k \ge 2$. We will suppose that Theorem B fails, and show that this
would give rise to a counterexample to Theorem 5.1.1. We break this into two
steps.

Step 1 First observe that if $N > L \ge 1$ and $A \subseteq [k]^N$ has $d(A) >
1 - k^{-2L}$ then $A$ necessarily contains a whole $L$-dimensional combinatorial
subspace. Indeed, having density as high as this implies that each of the $k^L$
subsets

\[ A_u := \{w \in [k]^{N-L} : u \oplus w \in A\} \quad \text{for } u \in [k]^L \]

has density greater than $1 - k^{-L}$, and so there must be some $w \in \bigcap_{u \in [k]^L} A_u$,
implying that the subspace $[k]^L \hookrightarrow [k]^N : u \mapsto u \oplus w$ has image lying entirely
in $A$.

In particular, letting $L = 1$, if we assume that Theorem B fails then we
may let

\[ \delta_0 := \sup\{\delta > 0 : \text{Theorem B fails for subsets of density } \delta\} \]

and deduce that $0 < \delta_0 < 1$.

Step 2 Now fix some integer L > 1 and let A C [fc]^ be a subset of 
density d{A) = 5 > (1 + 2fc^)^'^o some N > L such that A contains 
no combinatorial lines. 

Let $N = L + M$ and decompose $[k]^N$ as $[k]^L \oplus [k]^M$. For each $w \in [k]^L$
let

\[ A_w := \{v \in [k]^M : w \oplus v \in A\}. \]

Clearly

\[ \frac{1}{k^L} \sum_{w \in [k]^L} d(A_w) = d(A) = \delta, \]

and on the other hand d(y4^) < (1 + 2^1+1 for ^ach once is sufficiently 
large, for otherwise $A_w$ would contain a combinatorial line by the definition
of $\delta_0$. Therefore the above equation between densities and Chebyshev's
inequality require that in fact every $w \in [k]^L$ have $d(A_w) > \delta/2$.
Now defining the probability measure $\mu_L$ on $\{0,1\}^{[k]^L}$ by

$$\mu_L\{(x_w)_{w \in [k]^L}\} := d(\{v \in [k]^M : 1_A(w \oplus v) = x_w\ \forall w \in [k]^L\})$$

for each $(x_w)_{w \in [k]^L} \in \{0,1\}^{[k]^L}$, we see that for each $L$ we have produced a
probability $\mu_L$ on $\{0,1\}^{[k]^L}$ such that

$$\mu_L\{x \in \{0,1\}^{[k]^L} : x_w = 1\} = d(A_w) > \delta/2 > \delta_0/4 \quad \forall w \in [k]^L$$

but

$$\mu_L\{x \in \{0,1\}^{[k]^L} : x_{\phi(i)} = 1\ \forall i \in [k]\} = 0$$

for any combinatorial line $\phi : [k] \to [k]^L$. Finally defining $\mu := \bigotimes_{L \geq 1} \mu_L$,
we obtain a measure that contradicts Theorem 5.1.1 with density $\delta_0/4$. □
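The passage from a set $A \subseteq [k]^{L+M}$ to a law on $\{0,1\}^{[k]^L}$ can be computed directly in small cases. The sketch below is only an illustration under the assumption that the law is that of the indicator field $(1_A(w \oplus v))_{w \in [k]^L}$ for $v$ uniform on $[k]^M$; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def words(k, n):
    """All words of length n over the alphabet {1, ..., k}."""
    return list(product(range(1, k + 1), repeat=n))

def empirical_law(A, k, L, M):
    """Law of (1_A(w + v))_{w in [k]^L} when v is uniform on [k]^M.

    Returns (law, heads): law maps each 0/1 pattern to its probability,
    and heads lists the words w in the order used inside each pattern.
    """
    A = set(A)
    heads, tails = words(k, L), words(k, M)
    counts = Counter(
        tuple(int(w + v in A) for w in heads) for v in tails
    )
    law = {x: Fraction(c, len(tails)) for x, c in counts.items()}
    return law, heads

def marginal(law, heads, w):
    """Probability that the coordinate indexed by w equals 1, i.e. d(A_w)."""
    idx = heads.index(w)
    return sum(p for x, p in law.items() if x[idx] == 1)
```

With $k = 2$, $L = 1$, $M = 2$ and $A$ the words of $[2]^3$ containing an odd number of 1s, each one-point marginal is $1/2$, while the pattern $(1,1)$ has probability zero: $A$ contains no line whose only wildcard is the first letter.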
Remark The above proof is essentially taken from Proposition 2.1 of [FK91],
where the reverse implication is also proved. ◁



5.2 Strongly stationary processes 

After introducing Theorem 5.1.1, Furstenberg and Katznelson make a further
reduction to a special subclass of measures.

Definition 5.2.1 (Semigroup action of combinatorial subspaces). If $\phi : [k]^N \to
[k]^*$ is a combinatorial subspace then for any product space $K^{[k]^*}$ we define
the corresponding map $T_\phi : K^{[k]^*} \to K^{[k]^N}$ by

$$(T_\phi(x))_w := x_{\phi(w)} \quad \text{for } w \in [k]^N \text{ and } x = (x_u)_{u \in [k]^*} \in K^{[k]^*},$$

and similarly define $T_\phi : K^{[k]^*} \to K^{[k]^*}$ in case $\phi : [k]^* \to [k]^*$. In the latter
case this specifies an action $\Gamma \curvearrowright K^{[k]^*}$.
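A finite-dimensional toy version of these maps may help. In the sketch below, which is an illustration only, a subspace is given by a template whose entries are either fixed letters or numbered wildcards `('*', m)`, and a configuration is a dictionary indexed by words; both encodings are assumptions of this example. It also exhibits the order reversal $T_{\phi \circ \psi} = T_\psi \circ T_\phi$.

```python
from itertools import product

def subspace(template):
    """Return the map phi sending a word w to the template with each
    numbered wildcard ('*', m) replaced by the letter w[m]."""
    def phi(w):
        return tuple(w[c[1]] if isinstance(c, tuple) else c
                     for c in template)
    return phi

def pull_back(phi, k, N):
    """T_phi: send a configuration x (a dict indexed by words) to the
    configuration on [k]^N given by (T_phi x)_w = x_{phi(w)}."""
    def T(x):
        return {w: x[phi(w)]
                for w in product(range(1, k + 1), repeat=N)}
    return T
```

For example, with `phi = subspace((('*', 0), 1, ('*', 1)))` mapping $[2]^2 \to [2]^3$ and `psi = subspace((('*', 0), ('*', 0)))` the diagonal line in $[2]^2$, applying `pull_back(phi, 2, 2)` and then `pull_back(psi, 2, 1)` agrees with pulling back along the composite `phi(psi(w))`.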

Definition 5.2.2 (Strongly stationary laws). A probability measure $\mu$ on the
product $(K^{[k]^*}, \Psi^{\otimes [k]^*})$ for some standard Borel space $(K, \Psi)$ is strongly
stationary if $T_{\phi\#}\mu = \mu$ for all subspaces $\phi \in \Gamma$. In this case the
transformations $T_\phi$ give to $(K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu)$ the structure of a
probability-preserving $\Gamma$-system.

Lemma 5.2.3. If Theorem 5.1.1 holds for all strongly stationary measures
for any $\delta > 0$ then it holds for all measures satisfying the conditions of that
theorem for any $\delta > 0$.

Proof This argument is again lifted directly from [FK91], and we only
sketch the details. Given a measure $\mu$ satisfying the conditions of
Theorem 5.1.1 for some $\delta > 0$, by applying the Carlson-Simpson Theorem [Car88]
to arbitrarily fine finite open coverings of the finite-dimensional spaces of
probability distributions on $\{0,1\}^{[k]^n}$ for increasingly large $n$, we obtain a
subspace $\psi : [k]^* \to [k]^*$ and an infinite word $w = w_1 w_2 \cdots \in [k]^{\mathbb{N}}$ such that
the restricted laws

$$T_{\psi(w_1 w_2 \cdots w_m \oplus\,\cdot\,)\#}\,\mu$$

converge to a strongly stationary law as $m \to \infty$, and since all one-dimensional
marginals of the input law gave probability at least $\delta$ to $\{1\}$, the same is true
of the limit measure. Finally, the subset of probability measures

$$\{\nu \in \mathrm{Pr}\,\{0,1\}^{[k]^*} : \nu\{x \in \{0,1\}^{[k]^*} : x_{\phi(i)} = 1\ \forall i \leq k\} > 0\}$$

is finite-dimensional and open for any given line $\phi : [k] \to [k]^*$, so if the limit
measure is in this set then so is some image of the original measure. □

An immediate consequence of the strong stationarity of a measure $\mu$ is that
for any two $N$-dimensional subspaces $\phi, \psi : [k]^N \to [k]^*$ we have $T_{\phi\#}\mu =
T_{\psi\#}\mu$. In case $N = 0$ we refer to this common image measure as the point
marginal of $\mu$ and denote it by $\mu^{\rm pt}$, and similarly in case $N = 1$ it is the line
marginal of $\mu$ and is denoted by $\mu^{\rm line}$. In these terms it is possible to give
another, more convenient reformulation of Theorem 5.1.1.

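Theorem 5.2.4. If $(K, \Psi)$ is a standard Borel space and $\mu$ is a strongly
stationary law on $(K^{[k]^*}, \Psi^{\otimes [k]^*})$ then for any $A_1, A_2, \ldots, A_k \in \Psi$ we have

$$\mu^{\rm line}(A_1 \times A_2 \times \cdots \times A_k) = 0 \quad \Longrightarrow \quad \mu^{\rm pt}(A_1 \cap A_2 \cap \cdots \cap A_k) = 0.$$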

The resemblance to Theorem 4.0.2 is far from accidental!
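As a sanity check, reading Theorem 5.2.4 as the implication "$\mu^{\rm line}(A_1 \times \cdots \times A_k) = 0$ implies $\mu^{\rm pt}(A_1 \cap \cdots \cap A_k) = 0$", here are two degenerate examples worked out by hand. They are illustrations added for orientation, not taken from the source; the auxiliary probability measure $\nu$ on $(K, \Psi)$ is an assumption of the examples. Both laws are strongly stationary because every subspace is an injective map of words.

```latex
% Example 1 (i.i.d. field): \mu = \nu^{\otimes [k]^*}.
\[
  \mu^{\rm pt} = \nu, \qquad
  \mu^{\rm line} = \nu^{\otimes k}, \qquad
  \mu^{\rm line}(A_1 \times \cdots \times A_k) = \prod_{i=1}^{k} \nu(A_i).
\]
% If the product vanishes then \nu(A_i) = 0 for some i, and hence
\[
  \mu^{\rm pt}(A_1 \cap \cdots \cap A_k) \le \nu(A_i) = 0.
\]

% Example 2 (constant field): x_w = x for every word w, with x drawn from \nu.
\[
  \mu^{\rm line}(A_1 \times \cdots \times A_k)
  = \nu(A_1 \cap \cdots \cap A_k)
  = \mu^{\rm pt}(A_1 \cap \cdots \cap A_k).
\]
```

In both cases the implication is immediate; the content of the theorem lies in laws that interpolate between these two extremes.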

The proof of Theorem 5.2.4 will involve a version of satedness for our
systems of interest; however, here a slight subtlety creeps in. In the
following we will need to work with only those $\Gamma$-systems that are of the form
$(K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T)$ for some strongly stationary measure $\mu$ (of course, the
huge semigroup $\Gamma$ could also have invariant measures for all sorts of other
Borel actions, not of this form). On the other hand, the conclusion of
Theorem 5.2.4 is not about the joint distribution of several copies of whole
$\Gamma$-systems under some self-joining. Rather, it is about the joint distribution of
some copies of just the 'one-dimensional' point marginal $(K, \Psi, \mu^{\rm pt})$ under
the line marginal: this is only a tiny fragment of the whole system
$(K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T)$.

The way we can keep track of the structure of point and line marginals 
between different such systems is by restricting the kinds of factor map we 
allow. 

Definition 5.2.5. Let $\mathsf{A}$ be the class of $\Gamma$-systems given by strongly stationary
measures on product spaces indexed by $[k]^*$, as above.
A coordinatewise factor (or cw-factor) of $\mathbf{X} = (K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T) \in \mathsf{A}$
is a $\sigma$-subalgebra of the form $\Phi^{\otimes [k]^*} \leq \Psi^{\otimes [k]^*}$ for some $\Phi \leq \Psi$. Slightly
abusively, we will sometimes refer instead to the single-coordinate $\sigma$-subalgebra
$\Phi$ as a cw-factor. Likewise, a cw-factor map is a map of the form

$$f^* : (K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T) \to (L^{[k]^*}, \Xi^{\otimes [k]^*}, \nu, T) : (x_w)_w \mapsto (f(x_w))_w$$

for some Borel map $f : (K, \Psi) \to (L, \Xi)$, and $f^*$ is a cw-isomorphism if $f$
is measurably invertible away from some $\mu^{\rm pt}$- and $\nu^{\rm pt}$-negligible sets (this is
clearly equivalent to its being an isomorphism in the usual sense).

With $f^*$ as above we shall sometimes refer to $f$ as its corresponding
single-coordinate map.

It is now easy to see that the class $\mathsf{A}$ is closed under joinings and inverse
limits, provided that we interpret a joining of two systems $(K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T)$
and $(L^{[k]^*}, \Xi^{\otimes [k]^*}, \nu, T)$ as a strongly stationary measure on $(K \times L)^{[k]^*}$ and
that we restrict our attention to inverse sequences whose connecting maps
are all cw-factor maps. We will henceforth refer to a subclass $\mathsf{C} \subseteq \mathsf{A}$ as
cw-idempotent if it is closed under cw-isomorphisms, joinings and inverse
limits involving cw-factor maps, and now observe that all of the definitions
and lemmas of Section 2.2 have direct analogs for cw-idempotent classes
obtained simply by insisting that all morphisms be given by cw-factor maps. In
particular, if $\mathsf{C}$ is a cw-idempotent class and $\mathbf{X} = (K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T) \in \mathsf{A}$
then the maximal cw-$\mathsf{C}$-factor of $\mathbf{X}$ is given by $\Phi^{\otimes [k]^*}$ where $\Phi$ is the maximal
$\sigma$-algebra in the family

$$\{\Xi \leq \Psi : \Xi \text{ is generated by some Borel map } f : (K, \Psi) \to (K_1, \Psi_1)
\text{ such that } (K_1^{[k]^*}, \Psi_1^{\otimes [k]^*}, f^*_\#\mu, T) \in \mathsf{C}\}.$$

We will write a cw-factor map coordinatizing this maximal $\mathsf{C}$-factor as $\zeta_{\mathsf{C}}^*$ for
some map $\zeta_{\mathsf{C}} : K \to \mathsf{C}K$ of single-coordinate spaces.

Given these observations we can make our analog of Definition 2.3.1.

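Definition 5.2.6 (CW-sated systems). For a cw-idempotent class $\mathsf{C} \subseteq \mathsf{A}$, a
system $\mathbf{X} \in \mathsf{A}$ is cw-$\mathsf{C}$-sated if for any cw-extension

$$\pi^* : \tilde{\mathbf{X}} = (\tilde{K}^{[k]^*}, \tilde{\Psi}^{\otimes [k]^*}, \tilde{\mu}, T) \to \mathbf{X}$$

the single-coordinate maps $\pi : \tilde{K} \to K$ and $\tilde{\zeta}_{\mathsf{C}} : \tilde{K} \to \mathsf{C}\tilde{K}$ are relatively
independent under $\tilde{\mu}^{\rm pt}$ over $\zeta_{\mathsf{C}} \circ \pi : \tilde{K} \to K \to \mathsf{C}K$, where $\tilde{\zeta}_{\mathsf{C}}$ and $\zeta_{\mathsf{C}}$
coordinatize the maximal $\mathsf{C}$-factors of $\tilde{\mathbf{X}}$ and $\mathbf{X}$ respectively.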

Theorem 5.2.7. If $(\mathsf{C}_i)_{i \in I}$ is a countable family of cw-idempotent classes then
any system $\mathbf{X}_0 \in \mathsf{A}$ admits a cw-extension $\pi : \mathbf{X} \to \mathbf{X}_0$ that is cw-$\mathsf{C}_i$-sated
for every $i \in I$.

Proof outline This proceeds in exact analogy with the proof of Theorem 2.3.2.
First, applying the argument for Lemma 2.3.3 to a bounded measurable
function $f$ on the single-coordinate space of an inverse limit shows that an inverse
limit of cw-$\mathsf{C}$-sated systems through cw-factor maps is cw-$\mathsf{C}$-sated.
Next, given a system

$$\mathbf{X} = (K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T) \in \mathsf{A},$$

we show how to produce a cw-sated extension for a single cw-idempotent
class $\mathsf{C}$: first enumerate an $L^2$-dense sequence $(f_r)_{r \geq 1}$ in the unit ball of
$L^\infty(\mu^{\rm pt})$, then apply the same exhaustion argument as in Step 1 of
Theorem 2.3.2 to produce an inverse sequence of cw-extensions

$$\cdots \longrightarrow \mathbf{X}_n = (K_n^{[k]^*}, \Psi_n^{\otimes [k]^*}, \mu_n, T) \longrightarrow \cdots \longrightarrow \mathbf{X}$$

such that for each $r$ it happens cofinally often that this extension is within a
factor of 2 of achieving the optimal increase in the $L^2$-norm of the conditional
expectation $\mathsf{E}_{\mu_n^{\rm pt}}(f_r \circ \psi_n \mid \zeta_{\mathsf{C}}^{(n)})$ (where $\psi_n : K_n \to K$ is the
single-coordinate map of the extension $\mathbf{X}_n \to \mathbf{X}$ and $\zeta_{\mathsf{C}}^{(n)}$ is the single-coordinate
map coordinatizing $\mathsf{C}\mathbf{X}_n$); and finally take the inverse limit of this sequence. Just
as in the proof of Theorem 2.3.2, if this inverse limit were not cw-$\mathsf{C}$-sated
then this would lead to a contradiction with our assumption on the increase of
$\|\mathsf{E}_{\mu_n^{\rm pt}}(f_r \circ \psi_n \mid \zeta_{\mathsf{C}}^{(n)})\|_2$ for some finite $n$.

Finally the proof is completed by arguing that given a countable
collection of cw-idempotent classes $\mathsf{C}_i$, we can produce one long inverse sequence
of extensions in which for each $i$ there is a cofinal subsequence of cw-$\mathsf{C}_i$-sated
systems, so that the inverse limit is cw-$\mathsf{C}_i$-sated for every $i$. □
This completes the modifications we need for our approach to Theorem B.
Note that detailed proofs of the above results written in the setting of strongly
stationary laws are given in [Ausa].

Remark In principle one could give a complete unification of Chapter 2
with the above modifications to it by phrasing all of these results in terms
of a general (not necessarily full) subcategory Cat of the category $\Gamma$-Sys of
all $\Gamma$-systems, and adopting a flexible meaning for the term 'relatively
independent'. In this work we have preferred to draw a more informal parallel
between our two settings of interest, but it may be instructive to deduce from
the proofs of Chapter 2 what basic properties we really need for the existence
of sated extensions and the various lemmas that support it. Although we leave
the proof to the reader, it turns out that Cat must admit two basic
constructions:

• it must have inverse limits;

• it must have generated factors: that is, if

$$\mathbf{Y} \longleftarrow \mathbf{X} \longrightarrow \mathbf{Z}$$

is a diagram in Cat, then there is an essentially unique minimal system
$\mathbf{W}$ that may be inserted into this diagram as

$$\mathbf{X} \longrightarrow \mathbf{W}, \quad \text{with } \mathbf{W} \longrightarrow \mathbf{Y} \text{ and } \mathbf{W} \longrightarrow \mathbf{Z}.$$

Note, interestingly, that it does not seem to be essential that any diagram
such as

$$\mathbf{X} \longrightarrow \mathbf{Z} \longleftarrow \mathbf{Y}$$

have a common extension of $\mathbf{X}$ and $\mathbf{Y}$ that can be inserted above it (of course,
working in the whole of $\Gamma$-Sys when $\Gamma$ is a group such a common extension
is provided by the relatively independent product).

While these assumptions on Cat are relatively innocuous, more drastic
steps are needed if we are to accommodate the instances of relative
independence appearing in both Theorem 2.3.2 and Theorem 5.2.7. The former
of these asserts the relative independence of two whole factors of some
extended system, whereas the latter concerns only the relative independence of
functions of a single fixed coordinate within each of those factors (that is,
relative independence under $\mu^{\rm pt}$ rather than $\mu$). In order to treat these
together, one could for example augment the category Cat by attaching to each
system some distinguished subalgebra of bounded measurable functions (the
whole of $L^\infty$ in the first case, and the subalgebra of functions of $x_w$ for some
distinguished $w \in [k]^*$ in the second), and then re-defining conditional
expectation as an operator acting only between these subalgebras for different
systems and satisfying the usual conditions of idempotence and agreement of
integrals against functions in the target subalgebra.

Altogether these very abstract considerations seem more demanding than
worthwhile, and I know of few other situations in which a non-standard
example of an abstract category of systems having these properties has been
useful in ergodic theory. One related area which could fit into this mould is
the study of partial exchangeability in probability theory, for which we refer
the reader to Kallenberg's book [Kal02], the survey papers [Aus08, Ald] and
the references given there. ◁



5.3 Another appeal to the infinitary hypergraph 
removal lemma 

The cw-idempotent classes for which we will apply Theorem 5.2.7 are as
follows.

Definition 5.3.1 (Partially insensitive processes). Given a subset $e \subseteq [k]$, a
process $(K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T) \in \mathsf{A}$ is $e$-insensitive if its line marginal satisfies

$$x_i = x_j \quad \text{for } \mu^{\rm line}\text{-a.e. } (x_1, x_2, \ldots, x_k) \in K^k \text{ for all } i, j \in e.$$

We write $\mathsf{A}_e \subseteq \mathsf{A}$ for the subclass of all $e$-insensitive processes.
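One way to build configurations with this property is to make $x_w$ depend on $w$ only through a letter-replacement map $r_{e,i}$ with $i \in e$: the points $\ell(i)$ and $\ell(j)$ of any line differ only at wildcard positions, where they carry letters of $e$. The sketch below checks this pointwise identity on a finite cube. It is an illustration only (it says nothing about strong stationarity), and the function names are hypothetical.

```python
from itertools import product

def replace_letters(w, e, i):
    """Letter-replacement map r_{e,i}: each letter of w lying in e becomes i."""
    return tuple(i if c in e else c for c in w)

def line_points(template, k):
    """Points ell(1), ..., ell(k) of the line encoded by a template
    over {1..k} plus the wildcard '*'."""
    return [tuple(i if c == '*' else c for c in template)
            for i in range(1, k + 1)]

def is_insensitive_along_lines(x, e, k, n):
    """Check x[ell(i)] == x[ell(j)] for all i, j in e and every
    combinatorial line ell in [k]^n."""
    symbols = list(range(1, k + 1)) + ['*']
    for template in product(symbols, repeat=n):
        if '*' not in template:
            continue  # not a line
        pts = line_points(template, k)
        if len({x[pts[i - 1]] for i in e}) > 1:
            return False
    return True
```

With $k = 3$ and $e = \{1, 2\}$, the configuration `x[w] = replace_letters(w, e, 1)` passes the check, whereas the identity labelling `x[w] = w` fails it on every line.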

The persistence of $e$-insensitivity under inverse limits and joinings is
immediate, and so we have:

Lemma 5.3.2. The class $\mathsf{A}_e$ is cw-idempotent for each $e \subseteq [k]$. □

In parallel with the developments of Section 4.3, given an arbitrary process

$$\mathbf{X} = (K^{[k]^*}, \Psi^{\otimes [k]^*}, \mu, T) \in \mathsf{A},$$

for each $e \subseteq [k]$ we let $\Phi_e$ denote the $e$-insensitive $\sigma$-subalgebra of $\Psi$
consisting of those $A \in \Psi$ such that $\mu^{\rm line}(\pi_i^{-1}(A) \triangle \pi_j^{-1}(A)) = 0$ for all $i, j \in e$,
where $\pi_i : K^k \to K$ is the coordinate projection. Letting $\zeta_e : (K, \Psi) \to
(K_e, \Psi_e)$ be some map of standard Borel spaces such that $\Phi_e$ agrees with
$\{\zeta_e^{-1}(E) : E \in \Psi_e\}$ modulo $\mu^{\rm pt}$-negligible sets, it follows that $\zeta_e^*$
is a cw-factor map that coordinatizes $\mathbf{X} \to \mathsf{A}_e\mathbf{X}$.

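Directly from the definition of $\Phi_e$ we observe that if $i, j \in e$ then $\pi_i^{-1}(\Phi_e)$
and $\pi_j^{-1}(\Phi_e)$ differ only by $\mu^{\rm line}$-negligible sets, and we denote their common
$\mu^{\rm line}$-completion by $\Phi_e^\dagger$. If now $\mathcal{I} \subseteq \binom{[k]}{\geq 2}$ is an up-set, then similarly to the
setup of Section 4.3 we define $\Phi_{\mathcal{I}} := \bigvee_{e \in \mathcal{I}} \Phi_e$ and $\Phi_{\mathcal{I}}^\dagger := \bigvee_{e \in \mathcal{I}} \Phi_e^\dagger$.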

In terms of these definitions, the consequences of cw-satedness that we
need are now essentially parallel to Propositions 4.2.4 and 4.2.6.

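Proposition 5.3.3. For each $i \leq k$ let

$$\mathsf{C}_i := \bigvee_{j \leq k,\, j \neq i} \mathsf{A}_{\{i,j\}}.$$

If a system $\mathbf{X}$ with strongly stationary measure $\mu$ is cw-$\mathsf{C}_i$-sated for each $i$
then the $\sigma$-algebras $\pi_i^{-1}(\Psi) \leq \Psi^{\otimes k}$ are relatively independent under $\mu^{\rm line}$
over the further factors

$$\pi_i^{-1}\Big(\bigvee_{j \leq k,\, j \neq i} \Phi_{\{i,j\}}\Big).$$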
Proof Clearly it will suffice to prove that $\pi_1^{-1}(\Psi)$ is relatively independent
from $\pi_2^{-1}(\Psi) \vee \cdots \vee \pi_k^{-1}(\Psi)$ under $\mu^{\rm line}$ over $\pi_1^{-1}(\Xi)$, where

$$\Xi := \bigvee_{j=2}^{k} \Phi_{\{1,j\}},$$

since the cases of the other coordinates under $\mu^{\rm line}$ then follow by symmetry.

We prove this by contradiction, so suppose that $f_1, f_2, \ldots, f_k \in L^\infty(\mu^{\rm pt})$
are such that

$$\int f_1 \otimes f_2 \otimes \cdots \otimes f_k \, d\mu^{\rm line} \neq \int \mathsf{E}(f_1 \mid \Xi) \otimes f_2 \otimes \cdots \otimes f_k \, d\mu^{\rm line}.$$

We will deduce from this a contradiction with the cw-satedness of $\mathbf{X}$. By
replacing $f_1$ with $f_1 - \mathsf{E}(f_1 \mid \Xi)$ it suffices to assume that $\mathsf{E}(f_1 \mid \Xi) = 0$ but
that the left-hand integral above does not vanish.

For each $j = 2, 3, \ldots, k$ recall the letter-replacement map $r_{\{1,j\},j} : [k]^* \to
[k]^*$ defined in Section 5.1. In view of the strong stationarity of $\mu$, we may
transport the above non-vanishing integral to any combinatorial line in $[k]^*$:
in particular, picking some $w \in [k]^*$ for which $w^{-1}\{j\} \neq \emptyset$ for every $j$, the
points $\{w, r_{\{1,2\},2}(w), r_{\{1,3\},3}(w), \ldots, r_{\{1,k\},k}(w)\}$ form such a line, and so we
have

$$\int f_1(x_w) \cdot f_2(x_{r_{\{1,2\},2}(w)}) \cdots f_k(x_{r_{\{1,k\},k}(w)}) \, \mu(dx) = \kappa \neq 0.$$



Now define the probability measure $\lambda$ on $(K \times K^{\{2,3,\ldots,k\}})^{[k]^*}$ to be the
joint law under $\mu$ of

$$(x_w)_w \mapsto \big(x_w, x_{r_{\{1,2\},2}(w)}, x_{r_{\{1,3\},3}(w)}, \ldots, x_{r_{\{1,k\},k}(w)}\big)_w.$$

We see that all of its coordinate projections onto individual copies of $K$ are
still just $\mu^{\rm pt}$, the cw-factor map

$$\phi_1^* : (x_w, y_{2,w}, y_{3,w}, \ldots, y_{k,w})_w \mapsto (x_w)_w$$

has $\phi^*_{1\#}\lambda = \mu$, and the cw-factor map

$$\phi_j^* : (x_w, y_{2,w}, y_{3,w}, \ldots, y_{k,w})_w \mapsto (y_{j,w})_w$$

for $j = 2, 3, \ldots, k$ is $\lambda$-almost surely $\{1,j\}$-insensitive. Therefore through
the cw-factor map $\phi_1^*$ the law $\lambda$ defines an extension of $\mu$ as a measure space.

This new measure $\lambda$ may not be strongly stationary, so may not define
an extension of members of $\mathsf{A}$. However, we can now repeat the trick of
Lemma 5.2.3. By the Carlson-Simpson Theorem there are a subspace $\psi :
[k]^* \to [k]^*$ and an infinite word $w \in [k]^{\mathbb{N}}$ such that the pulled-back measures

$$T_{\psi(w_1 w_2 \cdots w_m \oplus\,\cdot\,)\#}\,\lambda$$

converge in the coupling topology on $(K \times K^{\{2,3,\ldots,k\}})^{[k]^*}$ (recall that for
couplings of fixed marginal measures this is compact; see Theorem 6.2 in [Gla03])
to a strongly stationary measure $\tilde{\mu}$. Since $\mu$ was already strongly stationary,
we must still have $\phi^*_{1\#}\tilde{\mu} = \mu$, and by the definition of the coupling topology
as the weakest for which integration of fixed product functions is continuous
it follows that we must still have, firstly, that

$$\int (f_1 \circ \pi_u \circ \phi_1^*) \cdot \prod_{j=2}^{k} (f_j \circ \pi_u \circ \phi_j^*) \, d\tilde{\mu} = \kappa \neq 0$$

for each $u \in [k]^*$ (where now we may omit the assumption that $u$ contains
every letter at least once, by strong stationarity), and secondly that the
cw-factors generated by the maps $\phi_j^*$ are $\{1,j\}$-insensitive under $\tilde{\mu}$, since this is
equivalent to the assertion that for any $A \in \Psi$ and line $\ell : [k] \to [k]^*$ we have

$$\int_{(K \times K^{\{2,3,\ldots,k\}})^{[k]^*}} 1_A(\phi_j(z_{\ell(1)})) \cdot 1_{K \setminus A}(\phi_j(z_{\ell(j)})) \, \tilde{\mu}(dz) = 0$$

and this is clearly a closed condition in the coupling topology.

It follows that this strongly stationary measure $\tilde{\mu}$ gives a genuine
cw-extension $\phi_1^* : \tilde{\mathbf{X}} \to \mathbf{X}$ such that the lift of $f_1 \circ \pi_1$ as a function of any
one coordinate must have a nontrivial inner product with some pointwise
product of $\{1,j\}$-insensitive functions under $\tilde{\mu}$ over $j = 2, 3, \ldots, k$. Hence
this lift has nonzero conditional expectation onto a $\sigma$-subalgebra of $\Psi \otimes
\Psi^{\otimes\{2,3,\ldots,k\}}$ coordinatizing a cw-factor in the class $\mathsf{C}_1$, but recalling our
assumption that $\mathsf{E}(f_1 \mid \Xi) = 0$, this provides the desired contradiction with
cw-$\mathsf{C}_1$-satedness. □
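Proposition 5.3.4. For each $e \subseteq [k]$ let

$$\mathsf{C}_e := \bigvee_{j \in [k] \setminus e} \mathsf{A}_{e \cup \{j\}}.$$

If $\mathbf{X}$ is cw-$\mathsf{C}_e$-sated for every $e$ then for any up-sets $\mathcal{I}, \mathcal{I}' \subseteq \binom{[k]}{\geq 2}$ the
$\sigma$-subalgebras $\Phi_{\mathcal{I}}^\dagger$ and $\Phi_{\mathcal{I}'}^\dagger$ are relatively independent under $\mu^{\rm line}$ over $\Phi_{\mathcal{I} \cap \mathcal{I}'}^\dagger$.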

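Proof As for Proposition 4.2.6, we start with the case in which $\mathcal{I}' = \langle e \rangle$ for
$e$ a member of $\binom{[k]}{\geq 2} \setminus \mathcal{I}$ of maximal size, and again just as for that proposition
it suffices to show that $\Phi_e^\dagger$ is relatively independent from $\bigvee_{j \in [k] \setminus e} \pi_j^{-1}(\Psi)$
over $\bigvee_{j \in [k] \setminus e} \Phi_{e \cup \{j\}}^\dagger$ under $\mu^{\rm line}$.

Again this is best proved by deriving a contradiction with cw-satedness.
Pick some $i \in e$, so $\Phi_e^\dagger$ agrees with $\pi_i^{-1}(\Phi_e)$ up to negligible sets, let

$$\Xi := \bigvee_{j \in [k] \setminus e} \Phi_{e \cup \{j\}}$$

and suppose we have some $f \in L^\infty(\mu^{\rm pt})$ that is $\Phi_e$-measurable and such that
$\mathsf{E}(f \mid \Xi) = 0$, and also $h_j \in L^\infty(\mu^{\rm pt})$ for each $j \in [k] \setminus e$ such that

$$\int (f \circ \pi_i) \cdot \prod_{j \in [k] \setminus e} (h_j \circ \pi_j) \, d\mu^{\rm line} = \kappa \neq 0.$$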

Arguing as for the preceding proposition, this nonvanishing can be
transported to any combinatorial line in $[k]^*$, including to a line such as $\{r_{e,1}(w),
r_{e,2}(w), r_{e,3}(w), \ldots, r_{e,k}(w)\}$ for any $w$ that contains every letter at least once.
This gives

$$\int f(x_{r_{e,i}(w)}) \cdot \prod_{j \in [k] \setminus e} h_j(x_{r_{e,j}(w)}) \, \mu(dx) = \kappa$$

for any such $w$, but since $f$ is $e$-insensitive we may replace the first factor in
this integrand simply by $f(x_w)$.

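It follows that if we define the probability measure $\lambda$ on $(K \times K^{[k] \setminus e})^{[k]^*}$
to be the joint law under $\mu$ of

$$(x_w)_w \mapsto \big(x_w, (x_{r_{e,j}(w)})_{j \in [k] \setminus e}\big)_w$$

then all of its coordinate projections onto individual copies of $K$ are still just
$\mu^{\rm pt}$, the cw-factor map

$$\phi^* : \big(x_w, (y_{j,w})_{j \in [k] \setminus e}\big)_w \mapsto (x_w)_w$$

has $\phi^*_\#\lambda = \mu$ and the cw-factor maps

$$\phi_j^* : \big(x_w, (y_{j,w})_{j \in [k] \setminus e}\big)_w \mapsto (y_{j,w})_w$$

are $\lambda$-almost surely $(e \cup \{j\})$-insensitive. Therefore through $\phi^*$ the measure
$\lambda$ is an extension of the measure $\mu$, and the above inequality gives a non-zero
inner product for the lift of $f \circ \pi_u$ through $\phi^*$ with some product over $j \in [k] \setminus e$
of $(e \cup \{j\})$-insensitive functions under $\lambda$, which we can express as

$$\int (f \circ \pi_u \circ \phi^*) \cdot \prod_{j \in [k] \setminus e} (h_j \circ \pi_u \circ \phi_j^*) \, d\lambda = \kappa \neq 0$$

for any $u \in [k]^*$ that contains each letter at least once.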

To complete the proof, we may argue exactly as for Proposition 5.3.3
that within the not-necessarily-strongly-stationary law $\lambda$ we can find
infinite-dimensional subspaces for which the corresponding image measures

$$T_{\psi(w_1 w_2 \cdots w_m \oplus\,\cdot\,)\#}\,\lambda$$

converge in the coupling topology to a strongly stationary extension $\tilde{\mu}$ of
$\mu$, and such that this extension preserves the feature that the lift of $f \circ \pi_u$ has
a nontrivial inner product with a pointwise product of $(e \cup \{j\})$-insensitive
functions under $\tilde{\mu}^{\rm pt}$ for any word $u$. By our assumption that $\mathsf{E}(f \mid \Xi) = 0$ this
gives a contradiction with cw-$\mathsf{C}_e$-satedness, as required.

The general case now follows by induction on $\mathcal{I}'$ for each fixed $\mathcal{I}$
exactly as for Proposition 4.2.6. □

Proof of Theorem 5.2.4 An initial application of Theorem 5.2.7 allows us
to assume that $\mathbf{X}$ is cw-sated for all the classes involved in Propositions 5.3.3
and 5.3.4.

Next, exactly as for the proof of Theorem 4.0.2, applying Proposition 5.3.3
shows that it suffices to prove Theorem 5.2.4 in case the sets $A_i$ lie in the
$\sigma$-subalgebras $\Xi_i := \bigvee_{j \in [k] \setminus \{i\}} \Phi_{\{i,j\}}$.

Finally, it follows from the definitions and Proposition 5.3.4 that the
probability space $(K, \Psi, \mu^{\rm pt})$, its self-coupling $\mu^{\rm line}$ and the $\sigma$-subalgebras $\Phi_e$ and
their lifts $\Phi_e^\dagger$ for $e \subseteq [k]$ satisfy all the conditions of the 'infinitary removal
result' Proposition 4.3.1, so another appeal to that proposition completes the
proof. □



Postscript to the above proof 

After the appearance of Furstenberg and Katznelson's original, technically
rather demanding proof of Theorem B in [FK91], considerable efforts were
made to provide firstly a simpler proof, and more importantly one that could
be made effective to deduce some quantitative bound on the necessary
dependence of $N_0$ on $\delta$ and $k$.

Both of these goals were recently achieved by a large online collaboration,
instigated by Tim Gowers and involving several other mathematicians, called
Polymath 1. Importantly, their new proof does give a dependence of $N_0$ on $\delta$
and $k$ similar to the dependence obtained for the Multidimensional Szemeredi
Theorem by using the hypergraph regularity and removal lemmas. All these
developments can be found online ([Polb]) and in the preprint [Pola].

Importantly, the infinitary proof of Theorem B that we have reported above
relies on an observation that was originally taken from their work. I will not
attempt an exact translation here since the lexicons of these two approaches
are very different, but the outcome for stochastic processes is essentially
the observation that an initially-given system $\mathbf{X} \in \mathsf{A}$ can be combined in
a strongly stationary joining with some $\{1,j\}$-insensitive systems as in our
proof of Proposition 5.3.3, which then gives some information on the
structure of the original process $\mathbf{X}$ (in our case by an appeal to cw-satedness).
Chapter 6



Coda: a general structural 
conjecture 

It seems inadequate to finish this dissertation without discussing at least some
of the issues obviously left open by the preceding chapters. Perhaps most
interesting for ergodic theory is the meta-question introduced in Section 4.1,
and in this last chapter I offer a few further speculations on what additional
answers to it we might hope for.

Our first clue in this direction is offered by the works [HK05] of Host
and Kra and [Zie07] of Ziegler, establishing the special case of Theorem C
corresponding to different powers of a single ergodic transformation: that is,
the result that if $T : \mathbb{Z} \curvearrowright (X, \Sigma, \mu)$ is ergodic and $f_1, f_2, \ldots, f_d \in L^\infty(\mu)$
then the averages

$$S_N(f_1, f_2, \ldots, f_d) := \frac{1}{N} \sum_{n=1}^{N} (f_1 \circ T^n) \cdot (f_2 \circ T^{2n}) \cdots (f_d \circ T^{dn})$$

converge in $L^2(\mu)$ as $N \to \infty$. Importantly, those two works both rest on a
quite detailed result about 'characteristic factors' for these averages:

Theorem 6.0.5 (Host-Kra Theorem). If $\mathbf{X} = (X, \Sigma, \mu, T)$ is as above then
there is a factor $\Phi \leq \Sigma$ that is characteristic for the averages $S_N$ in the sense
that

$$S_N(f_1, f_2, \ldots, f_d) \sim S_N(\mathsf{E}(f_1 \mid \Phi), \mathsf{E}(f_2 \mid \Phi), \ldots, \mathsf{E}(f_d \mid \Phi))$$

in $L^2(\mu)$ for any $f_1, f_2, \ldots, f_d \in L^\infty(\mu)$ as $N \to \infty$, and which can be
generated by a factor map to a $(d-1)$-step pro-nilsystem: that is, it can be
generated by some increasing sequence of factor maps

$$\pi_n : (X, \Sigma, \mu, T) \to (G_n/\Gamma_n, \mathrm{Borel}, m_{G_n/\Gamma_n}, R_{g_n})$$

to systems that are given by rotations by elements $g_n$ on compact $(d-1)$-step
nilmanifolds $G_n/\Gamma_n$.

Remark This notion of a characteristic factor is just a slight modification
to that of a partially characteristic factor that we met in Proposition 3.3.1. In
fact, Ziegler proves in [Zie07] that there is a unique minimal factor with the
properties given by the above theorem, and in Leibman's later treatment of
these two proofs in [Lei05] it is shown that the pro-nilsystem characteristic
factors constructed by Host and Kra are precisely these minimal factors. ◁

This very surprising theorem asserts that for a completely arbitrary
ergodic $\mathbb{Z}$-system $\mathbf{X}$, its nonconventional averages $S_N$ are entirely controlled
by some highly-structured factor of $\mathbf{X}$, which can be expressed in terms of
the very concrete data of rotations on nilmanifolds. In this informal
discussion we will assume familiarity with the definition and basic properties of
such 'nilsystems'; they are treated thoroughly in [HK05] and [Zie07]
and the references given there.
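The simplest instance of this phenomenon can be checked numerically. The sketch below is an illustration under stated assumptions, not a construction from the source: it takes $d = 2$, the circle rotation $T x = x + \alpha \bmod 1$ with the arbitrary irrational choice $\alpha = \sqrt{2} - 1$, and characters $e_m(x) = e^{2\pi i m x}$. The names `nonconventional_average`, `rotate` and `char` are hypothetical.

```python
import cmath

def nonconventional_average(fs, T, x, N):
    """S_N(f_1, ..., f_d)(x) = (1/N) * sum_{n=1}^{N} prod_i f_i(T^{i*n} x).

    T(x, m) must return the m-th iterate of the transformation at x.
    """
    total = 0
    for n in range(1, N + 1):
        term = 1
        for i, f in enumerate(fs, start=1):
            term *= f(T(x, i * n))
        total += term
    return total / N

alpha = 2 ** 0.5 - 1  # assumed irrational rotation number

def rotate(x, m):
    """m-th iterate of the circle rotation x -> x + alpha mod 1."""
    return (x + m * alpha) % 1.0

def char(freq):
    """The character x -> exp(2*pi*i*freq*x) on the circle."""
    return lambda x: cmath.exp(2j * cmath.pi * freq * x)
```

With $f_1 = e_2$ and $f_2 = e_{-1}$ every summand equals $e_1(x)$, so $S_N(f_1, f_2)(x) = e^{2\pi i x}$ for all $N$ even though both functions have integral zero: the limit is not simply the product of the integrals. With $f_1 = f_2 = e_1$ the summands are $e_2(x)\,e^{2\pi i \cdot 3n\alpha}$ and the averages tend to zero.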

Host and Kra's and Ziegler's proofs of the one-dimensional case of
Theorem C proceed via two different approaches to Theorem 6.0.5. They are both
rather longer than the proof in our Chapter 3, but using Theorem 6.0.5 they
give a much more precise picture of the limit. On the other hand, the strategy
used in our Chapter 3 simply cannot be specialized to the one-dimensional
setting: it is essential for our approach that the result be formulated for the
linearly independent directions $e_1, e_2, \ldots, e_d \in \mathbb{Z}^d$. This is because even
if we are initially given a $\mathbb{Z}$-system $(X, \Sigma, \mu, T)$, we must re-interpret it as a
$\mathbb{Z}^d$-system in order to pass to an extension that is sated in the way required
by Proposition 3.3.1. To do this we define a new $\mathbb{Z}^d$-action $T'$ on $X$ by
$(T')^{e_i} := T^i$, but once we ascend to our sated extension this special
structure of a collection of powers of a single transformation will be lost, and so
we can no longer focus on the special, one-dimensional case of convergence.
In a sense, this quiet assumption of linear independence was a precursor to
the discussion of Section 4.1: we need the linear independence of the
subgroups $\mathbb{Z}e_i \leq \mathbb{Z}^d$ in order that a corresponding notion of satedness has useful
consequences.

However, these two very different approaches to different cases of
Theorem C do suggest a reconciliation of the issue raised at the end of Section 4.1:
what becomes of our meta-question on the possible joinings of $\mathbb{Z}^D$-systems
$\mathbf{X}_i \in \mathsf{Z}_0^{\Gamma_i}$ if the subgroups $\Gamma_i$ are not linearly independent? The centrepiece of
this final chapter is a conjectural answer to this question. If true, it would offer
the first step in a complete 'interpolation' between the structural result 6.0.5
of Host and Kra and our much softer result 4.1.2.

In order to formulate our conjecture, we first need some more notation.
The notion of an isometric extension of ergodic probability-preserving
systems and the fact that any such can be coordinatized as a skew-product
extension over the base system by some compact homogeneous space are very
classical; see, for instance, Glasner's book [Gla03]. Here we will also assume
familiarity with a natural but less common generalization of this theory to the
case in which the base system is not necessarily ergodic, in which the fibres
of our extension must be allowed to vary in a suitable 'measurable' way over
the ergodic components of the base system. This theory is set up generally
in [Ausc], where the lengthy but routine work of re-establishing all the
well-known theorems from the ergodic case is carried out in full, and we will also
adopt the basic notations of that paper.

Definition 6.0.6 (Direct integral of pro-nilsystems). If $\Gamma$ is a discrete Abelian
group then a $\Gamma$-system $\mathbf{X} = (X, \Sigma, \mu, T)$ is a direct integral of $k$-step
pro-nilsystems if it admits a tower of factors

$$\mathbf{X} = \mathbf{X}_k \to \mathbf{X}_{k-1} \to \cdots \to \mathbf{X}_1 \to \mathbf{X}_0$$

in which the action of $\Gamma$ on $\mathbf{X}_0$ is trivial, each extension $\mathbf{X}_i \to \mathbf{X}_{i-1}$ for
$i \geq 1$ can be coordinatized as a relatively ergodic extension by
measurably-varying compact metrizable Abelian group data

$$\mathbf{X}_i \cong \mathbf{X}_{i-1} \ltimes (A_{i,\bullet}, m_{A_{i,\bullet}}, \sigma_i) \to \mathbf{X}_{i-1}$$

(so the measurable group data $A_{i,\bullet}$ really varies only over the base system
$\mathbf{X}_0$) and for each ergodic component $\mu_s$ of $\mu$ the resulting $k$-step Abelian
distal ergodic $\Gamma$-system

$$(X, \Sigma, \mu_s, T) = (A_{1,s} \times A_{2,s} \times \cdots \times A_{k,s}, \mathrm{Borel}, \mathrm{Haar}, \sigma_1 \times \sigma_2 \times \cdots \times \sigma_k)$$

is measure-theoretically isomorphic to an inverse limit of actions of $\Gamma$ by
commuting rotations on $k$-step nilmanifolds.

Remark In fact it seems likely that the above class of systems can be set
up in several different ways, which will presumably turn out to be equivalent.
I have chosen the above definition here because I suspect it will ultimately
prove relatively convenient for establishing the necessary properties of these
systems, but an alternative has already appeared in the literature in the
paper [CFH] of Chu, Frantzikinakis and Host. ◁

The following lemma is now routine, given the ergodic case which is
classical (it follows from the nilmanifold case of Ratner's Theorem: see, for
instance, [Lei07, Lei10]).

Lemma 6.0.7. If $\Lambda \leq \Gamma$ is an inclusion of discrete Abelian groups, then
the class $\mathsf{Z}^{\Lambda}_{\mathrm{nil},k}$ of those $\Gamma$-systems whose $\Lambda$-subactions are direct integrals of
$k$-step pro-nilsystems is an idempotent class of $\Gamma$-systems. We refer to it as
the class of $\Lambda$-partially $k$-step pro-nilsystems. □

We are now ready to offer our conjectural strengthening of Theorem 4.1.2
to the case of linearly dependent subgroups $\Gamma_i$.

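Conjecture 6.0.8 (General Structural Conjecture). Suppose that $\Gamma_i \leq \mathbb{Z}^D$ for
$i = 1, 2, \ldots, r$ are subgroups among which there are no pairwise inclusions
and $n_1, n_2, \ldots, n_r \geq 0$ are integers. Then depending on these data there are
finite families of pairs

$$(\Lambda_{i,1}, m_{i,1}), (\Lambda_{i,2}, m_{i,2}), \ldots, (\Lambda_{i,k_i}, m_{i,k_i}) \quad \text{for } i = 1, 2, \ldots, r$$

such that each $m_{i,j} \geq 0$ is an integer and $\Lambda_{i,j} \leq \mathbb{Z}^D$ is a subgroup properly
containing $\Gamma_i$ for each $i, j$, and for which the following holds.
If $\mathbf{X}_i \in \mathsf{Z}^{\Gamma_i}_{\mathrm{nil},n_i}$ for each $i = 1, 2, \ldots, r$ and each $\mathbf{X}_i$ is sated with respect to
all possible joins of classes of the form $\mathsf{Z}^{\Gamma}_{\mathrm{nil},n}$ for $\Gamma \leq \mathbb{Z}^D$ and $n \geq 0$, then for
any joining $\pi_i : \mathbf{Y} \to \mathbf{X}_i$, $i = 1, 2, \ldots, r$, the factors $\pi_i^{-1}(\Sigma_i)$ are relatively
independent over their further factors

$$\pi_i^{-1}\Big(\bigvee_{j=1}^{k_i} \Phi_{i,j}\Big)$$

where $\Phi_{i,j}$ is the factor of $\mathbf{X}_i$ generated by the factor map to
$(\mathsf{Z}^{\Gamma_i}_{\mathrm{nil},n_i} \cap \mathsf{Z}^{\Lambda_{i,j}}_{\mathrm{nil},m_{i,j}})\mathbf{X}_i$.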

Remark We invoke the 'no-inclusions' condition on the subgroups $\Gamma_i$ in
order to avoid degenerate cases. Without it, we might for example be asking
for the collection of all possible joinings between two systems $\mathbf{X}_i \in \mathsf{Z}_0^{\Gamma_i}$ for
$i = 1, 2$ with $\Gamma_1 \geq \Gamma_2$, and in this case Lemma 4.1.1 tells us something about
the less constrained system $\mathbf{X}_2$, but on the side of the more constrained
system $\mathbf{X}_1$ the joining may clearly be completely arbitrary. ◁

In particular, the case in which $\mathbf{X}_i$ has trivial $\Gamma_i$-subaction corresponds
to $n_i = 0$, and in this case the above conjecture asserts that given enough
satedness, the factors $\pi_i^{-1}(\Sigma_i)$ of the joining system are relatively independent
over some further factors, each of which is assembled as a join of systems
from the classes $\mathsf{Z}_0^{\Gamma_i} \cap \mathsf{Z}^{\Lambda_{i,j}}_{\mathrm{nil},m_{i,j}}$. In particular, while each of these ingredients
may not be partially invariant under any subgroup of $\mathbb{Z}^D$ strictly larger than
$\Gamma_i$, for each of them we do know something quite concrete (in terms of
pro-nilsystems) about the subaction of some properly larger subgroup $\Lambda_{i,j} \gneq \Gamma_i$.

Of course, the above conjecture does not strictly cover Theorem 4.1.2,
since that gives much more precise information on the pairs $(\Lambda_{i,j}, m_{i,j})$ in
case the $\Gamma_i$ are linearly independent: to wit, the $\Lambda_{i,\ell}$ are the sums $\Gamma_i + \Gamma_\ell$
for $\ell \neq i$, and $m_{i,j} = 0$. While a final understanding of Conjecture 6.0.8
would presumably also give a recipe for producing these pairs in the general
case (and so would recover the exact details of our known special cases),
the slightly incomplete formulation of Conjecture 6.0.8 seems ample for our
present discussion, and as I write this any sensible guess as to its completion
appears beyond reach.

Indeed, by itself Conjecture 6.0.8 seems very optimistic, so it is worth
mentioning some special cases of it beyond Theorem 4.1.2 for which we have
some supplementary evidence.
Firstly, if $D = 2$, each $n_i = 0$ and the $\Gamma_i$ are pairwise
linearly-independent one-dimensional subgroups
$\mathbb{Z}\mathbf{v}_i \leq \mathbb{Z}^2$, then we can take a sensible guess
at a more precise version of the above conjecture: that any joining of systems
$\mathbf{X}_i \in \mathsf{Z}_0^{\Gamma_i}$ should be relatively independent
over the maximal $(r-1)$-step pro-nilsystem factors
$\mathbf{X}_i \to \mathsf{Z}_{\mathrm{nil},r-1}\mathbf{X}_i$. Indeed, this
would simply correspond to the Host-Kra Theorem in the case of the
$\mathbb{Z}^2$-system

    $\tilde{\mathbf{X}} := (X^r, \Sigma^{\otimes r}, \mu^{\mathrm{F}}, \tilde{T})$

with $\tilde{T}^{e_1} := T \times T \times \cdots \times T$ and
$\tilde{T}^{e_2} := T \times T^2 \times \cdots \times T^r$, where now the
subgroups are $\Gamma_i = \mathbb{Z}(e_2 - i e_1)$ and the coordinate
projections $\pi_i : X^r \to X$ define factor maps to suitable
$\Gamma_i$-partially-invariant $\mathbb{Z}^2$-systems $\mathbf{X}_i$,
constructed from $\mathbf{X}$ as in the proof of Proposition 4.2.4. In fact,
I strongly suspect that the methods of either [HK05] or [Zie07] could be
adapted directly to proving this more general result on the possible joinings
of such partially-invariant systems. Other, similar results on possible
joinings of partially-invariant systems that do not require any extensions but
would correspond to further special cases of Conjecture 6.0.8 have appeared in
Frantzikinakis and Kra [FK05] (where nonconventional averages such as in our
Theorem C are studied, but subject to some additional hypotheses on the
individual ergodicity of several one-dimensional subactions), in Chu [Chu09]
and in Chu, Frantzikinakis and Host [CFH]. In each of these cases, the joining
in question has been either the Furstenberg self-joining of some tuple of
commuting transformations, or the related Host-Kra self-joining (originally
defined in [HK05] for the case of powers of a single transformation, and since
adapted to the multi-directional case in [Hos09, Chu09, CFH]). However, in
each of these cases it seems likely that the methods employed could be adapted
to proving a corresponding instance of Conjecture 6.0.8.
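
The partial invariance used in the Host-Kra example above can be checked by a
one-line computation. The following is only a sketch, under the assumption
that the two generators of the $\mathbb{Z}^2$-action on $X^r$ are
$T \times T \times \cdots \times T$ and $T \times T^2 \times \cdots \times T^r$;
the names $\tilde{T}^{e_1}$, $\tilde{T}^{e_2}$ for these generators and
$\pi_i$ for the $i$-th coordinate projection are notation assumed here.

```latex
% Assumed notation: \tilde{T}^{e_1} = T \times T \times \cdots \times T and
% \tilde{T}^{e_2} = T \times T^2 \times \cdots \times T^r, acting on X^r.
% These commute, being coordinatewise powers of the single transformation T.
\[
  \tilde{T}^{e_2 - i e_1}
  = \tilde{T}^{e_2} \circ \bigl(\tilde{T}^{e_1}\bigr)^{-i}
  = T^{1-i} \times T^{2-i} \times \cdots \times T^{r-i}.
\]
% The i-th coordinate of this product is T^{i-i} = \mathrm{id}, hence
\[
  \pi_i \circ \tilde{T}^{e_2 - i e_1} = \pi_i
  \qquad (1 \leq i \leq r),
\]
% so \pi_i is invariant under the subgroup \mathbb{Z}(e_2 - i e_1).
% Pairwise linear independence of the generators e_2 - i e_1 = (-i, 1):
\[
  \det \begin{pmatrix} -i & 1 \\ -j & 1 \end{pmatrix} = j - i \neq 0
  \qquad (i \neq j).
\]
```

In words: each coordinate projection is invariant under its own
one-dimensional subgroup, and no two of these subgroups are commensurable,
which is the configuration described in this special case.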

Another special case of Conjecture 6.0.8, the first beyond Theorem 4.1.2
that does require an ascent to sated extensions, appears in [Ausd, Ause].
Indeed, the principal structural result of [Ause] can be phrased as asserting
that if $\mathbf{p}_1$, $\mathbf{p}_2$ and $\mathbf{p}_3 \in \mathbb{Z}^2$ are
three directions which together with the origin $\mathbf{0} \in \mathbb{Z}^2$
lie in general position, then for a sufficiently sated system $\mathbf{X}$ the
Furstenberg self-joining $\mu^{\mathrm{F}}$ of the quadruple of
transformations $\mathrm{id}$, $T^{\mathbf{p}_1}$, $T^{\mathbf{p}_2}$,
$T^{\mathbf{p}_3}$ is such that the coordinate projections
$\pi_i : X^{\{0,1,2,3\}} \to X$ are relatively independent over their further
factors
and



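    $\pi_i^{-1}\bigl(\Sigma^{T^{\mathbf{p}_i}} \vee \Sigma^{T^{\mathbf{p}_i} = T^{\mathbf{p}_j}} \vee \Sigma^{T^{\mathbf{p}_i} = T^{\mathbf{p}_k}} \vee \Sigma_{\mathrm{nil},2}\bigr)$ for $\{i,j,k\} = \{1,2,3\}$.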

Arguing again as for Proposition 4.2.4, this would follow from a special case
of Conjecture 6.0.8 (again with some more precise information on the pairs
$(\Lambda_{i,j}, m_{i,j})$) when $D = 3$, $r = 4$ and $\Gamma_0$, $\Gamma_1$,
$\Gamma_2$, $\Gamma_3$ are four one-dimensional subgroups of $\mathbb{Z}^3$,
any three of which are linearly independent.

At present no proof (or disproof) of Conjecture 6.0.8 seems to be at hand.
Nevertheless, the various cases mentioned above do give me hope for it,
and I strongly suspect that any result as powerful as this would constitute
a major addition to our toolkit for approaching questions of multiple
recurrence. For example, I would expect it to shed considerable new light on
the Bergelson-Leibman conjecture on the convergence of 'polynomial'
nonconventional averages [BL02]. For a recent discussion of these latter
questions see [Ausd, Ause], where the proof of an instance of this latter
conjecture was the original motivation for the result on joint distributions
mentioned above.